Preface



The International Symposium on Practical Aspects of Declarative Languages 
(PADL) focuses on practical applications of declarative languages. The collection 
of papers in this volume was presented at PADL 2001. The symposium was held 
in Las Vegas, Nevada, March 11-12, 2001. 

Forty papers were submitted in response to the call for papers. Twenty-three 
papers were finally selected for presentation at the symposium. The symposium 
included invited talks by Joe Armstrong of Bluetail, Raghu Ramakrishnan from 
the University of Wisconsin at Madison, and David S. Warren from the State 
University of New York at Stony Brook. 

The symposium was sponsored and organized by COMPULOG AMERICAS 
(http://www.cs.nmsu.edu/~complog), a network of research groups dedicated 
to promoting research in logic programming and related areas, by the Associa- 
tion for Logic Programming (http://www.cwi.nl/projects/alp), the Depart- 
ment of Computer Science, University of Texas at Dallas and the Department 
of Computer Science at the State University of New York at Stony Brook. The 
support of many individuals was crucial to the success of this symposium. My 
thanks to Giridhar Pemmasani, Samik Basu, Divyangi Anchan, Shachi Poddar, 
and Shabbir Dahodwala for their help with organizing and managing the re- 
viewing process. Special thanks to R.C. Sekar for setting up and managing the 
PADL 2001 web site and to Gopal Gupta for handling all the organizational
details. Many thanks to the program committee members for all their help in re- 
viewing and their advice. Finally, my thanks to all the authors who took interest 
in PADL 2001 and submitted papers. 
A Model Checker for Value-Passing Mu-Calculus
Using Logic Programming* 

C.R. Ramakrishnan 

Department of Computer Science, 

SUNY at Stony Brook 
Stony Brook, NY 11794-4400, USA 
cram@cs.sunysb.edu



Abstract. Recent advances in logic programming have been successfully 
used to build practical verification toolsets, as evidenced by the XMC 
system. Thus far, XMC has supported value-passing process languages, 
but has been limited to using the propositional fragment of modal mu-calculus
as the property specification logic. In this paper, we explore the
use of data variables in the property logic. In particular, we present value- 
passing modal mu-calculus, its formal semantics and describe a natural 
implementation of this semantics as a logic program. Since logic programs 
naturally deal with variables and substitutions, such an implementation 
need not pay any additional price — either in terms of performance, or in 
complexity of implementation — for having the added flexibility of data 
variables in the property logic. Our preliminary implementation supports 
this expectation. 

1 Introduction 

XMC is a toolset for specifying and verifying concurrent systems [RRS+00]. 
Verification in XMC is based on temporal-logic model checking [CES86]. In its
current form, temporal properties are specified in the alternation-free fragment 
of the modal mu-calculus [Koz83]; and system models are specified in XL, a 
process language with data variables and values, based on Milner’s CCS [Mil89]. 
The computational components of the XMC system, namely, the compiler for 
the specification language, the model checker, and the evidence generator are 
built on top of the XSB tabled logic programming system [XSB].

XMC started out in late 1996 as a model checker for basic CCS — i.e., CCS 
without variables. Subsequently, we extended the model checker to XL (which 
has variables and values) by exploiting the power of the logic programming 
paradigm to manipulate and propagate substitutions. But the property logic 
has remained as the propositional (i.e., variable-free) modal mu-calculus. In this 
paper, we describe how the model checker in XMC can be extended to handle the 
value-passing modal mu-calculus, a logic that permits quantified data variables. 

To date, there have been two streams of work on model checking value- 
passing calculus. Rathke and Hennessy [RH97] develop a local model checking 

* Research supported in part by NSF grants EIA-9705998 and CCR-9876242. 
algorithm, but consider constructs in the value-passing calculus that prevent any
guarantees on the completeness of the algorithm. Mateescu [Mat98] also gives
a local algorithm, but for the alternation-free fragment of the calculus. The
performance of the algorithms is not discussed in either work, but Mateescu’s
algorithm has been incorporated in the CADP verification toolkit [FGK+96].

In contrast to these two works, we consider a relatively simple but still ex- 
pressive set of value-passing constructs in the calculus, and describe a model 
checker that can be constructed with very minor changes to the propositional 
model checker. Our model checker is also local. Furthermore,
the extensions proposed here apply to mu-calculus formulas of
arbitrary alternation depth, and they add little overhead when handling the
propositional fragment of the logic. This is yet another illustration of the
expressiveness of the logic programming paradigm and its potential for improving the
state of the art in an important application area. 

The rest of the paper is organized as follows. We begin with a description 
of modal mu-calculus, its semantics and its model checker as a logic program 
(Section 2). We then introduce the value-passing modal mu-calculus and its 
semantics (Section 3) and describe the changes needed in the model checker to 
support value passing (Section 3.3). We show that the performance of the model 
checker for the value-passing calculus matches the performance of the original 
model checker for the propositional case (Section 3.4). The work on value-passing 
logics raises interesting issues on model checking formulas with free variables, 
which can be used to query the models (Section 4). 

2 Propositional Modal Mu-Calculus 

The modal mu-calculus [Koz83] is an expressive temporal logic whose semantics 
is usually described over sets of states of labeled transition systems (LTSs). An 
LTS is a finite directed graph with nodes representing states, and edges repre- 
senting transitions between states. In addition, the edges are labeled with an 
action, which is a symbol from a finite alphabet. The LTS is encoded in a logic 
program by a set of facts trans(Src, Act, Dest), where Src, Act, and Dest
are the source state, label and target state, respectively, of each transition. 

Preliminaries: We use the convention used in logic programming, writing variable
names in upper case, function and predicate names in lower case. We use
$\theta$ to denote substitutions, which map variables to terms over a chosen signature
(usually the Herbrand domain). By $\{X \leftarrow t\}$ we denote a substitution that maps
variable $X$ to term $t$. Application of a substitution $\theta$ to a term $t$ is denoted by
$t[\theta]$, and composition of substitutions is denoted by ‘$\circ$’.
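
The notation above can be made concrete with a small sketch. Python is used here only as a neutral executable notation; in the paper's setting the logic programming engine maintains substitutions itself. The term representation (tuples for compound terms, strings starting with an upper-case letter for variables), the helper names `apply_subst` and `compose`, and the reading of composition as "apply the right-hand substitution first" are all assumptions of this sketch.

```python
def is_var(term):
    """Variables are strings starting with an upper-case letter (Prolog convention)."""
    return isinstance(term, str) and term[:1].isupper()

def apply_subst(term, theta):
    """Return term[theta]: replace every variable bound in theta by its image."""
    if is_var(term):
        return theta.get(term, term)
    if isinstance(term, tuple):          # compound term: (functor, arg1, ..., argN)
        return (term[0],) + tuple(apply_subst(arg, theta) for arg in term[1:])
    return term                          # constant

def compose(theta2, theta1):
    """Return theta2 o theta1: the substitution that applies theta1 first, then theta2."""
    result = {var: apply_subst(image, theta2) for var, image in theta1.items()}
    for var, image in theta2.items():
        result.setdefault(var, image)    # bindings of theta2 not shadowed by theta1
    return result

theta1 = {"X": ("f", "Y")}      # {X <- f(Y)}
theta2 = {"Y": "a"}             # {Y <- a}
term = ("g", "X", "Y")          # g(X, Y)
print(apply_subst(term, compose(theta2, theta1)))   # ('g', ('f', 'a'), 'a')
```

Applying the composed substitution gives the same result as applying `theta1` and then `theta2` in sequence, which is the property the sketch is meant to illustrate.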

2.1 Syntax 

Formulas in the modal mu-calculus are written using the following syntax: 

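\[ \varphi \rightarrow Z \mid tt \mid ff \mid \varphi \lor \varphi \mid \varphi \land \varphi \mid \langle A \rangle \varphi \mid [A]\varphi \mid \mu Z.\varphi \mid \nu Z.\varphi \]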
In the above, $Z$ is drawn from a set of formula names and $A$ is a set of actions;
tt and ff are propositional constants; $\lor$ and $\land$ are standard logical connectives;
and $\langle A \rangle \varphi$ (possibly after action $A$ formula $\varphi$ holds) and $[A]\varphi$ (necessarily after
action $A$ formula $\varphi$ holds) are modal operators. The formulas $\mu Z.\varphi$ and $\nu Z.\varphi$
stand for least and greatest fixed points, respectively.

For example, a basic property, the absence of deadlock, is expressed in this 
logic by the following formula: 

\[ \nu\, df.\ [-\{\}]\,df \land \langle -\{\} \rangle tt \qquad (1) \]

where ‘$-$’ stands for set complement (and hence ‘$-\{\}$’ stands for the universal
set of actions). The formula states that from every reachable state ($[-\{\}]df$) a
transition is possible ($\langle -\{\} \rangle tt$).
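
To make the greatest fixed point in Formula (1) concrete, the following sketch computes the set of deadlock-free states of a tiny LTS by iterating downward from the set of all states. It is an illustration in Python, not the paper's logic-programming encoding; the example transitions and the names `TRANS`, `successors` and `deadlock_free_states` are invented for this sketch.

```python
# Transitions as (source, action, target) triples, mirroring trans/3 facts.
TRANS = {
    ("s0", "a", "s1"),
    ("s1", "b", "s0"),
    ("s1", "c", "s2"),   # s2 has no outgoing transition: a deadlock
    ("t0", "a", "t0"),   # t0 loops forever and never reaches a deadlock
}
STATES = {s for (s, _, _) in TRANS} | {t for (_, _, t) in TRANS}

def successors(state):
    """All states reachable from `state` in one transition, whatever the action."""
    return {t for (s, _, t) in TRANS if s == state}

def deadlock_free_states():
    """Greatest fixed point of  df = [-{}]df AND <-{}>tt  by downward iteration."""
    current = set(STATES)                          # start from the universal set
    while True:
        refined = {s for s in current
                   if successors(s)                # <-{}>tt: some transition exists
                   and successors(s) <= current}   # [-{}]df: all successors satisfy df
        if refined == current:
            return current
        current = refined

print(sorted(deadlock_free_states()))   # ['t0']
```

States s0 and s1 are removed in successive rounds because they can reach the deadlocked state s2. Only t0 survives, which matches the reading "no reachable state is deadlocked".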

Fixed points may be nested. For instance, the property that a ‘b’ action is
eventually possible from each state is written as:

\[ \nu\, ib.\ [-\{\}]\,ib \land \langle -\{\} \rangle (\mu\, eb.\ \langle \{b\} \rangle tt \lor \langle -\{b\} \rangle eb) \qquad (2) \]

The inner fixed point (involving the formula name eb) states that a ‘b’ transition
is eventually reachable, and the outer fixed point (involving the formula
name ib) asserts that the inner formula is true in all states.

Apart from nesting, the inner fixed point may refer to the formula name 
defined by the outer fixed point. Such formulas are called alternating fixed points. 
For instance, the property that a ‘c’ action is enabled infinitely often on all 
infinite paths is written as: 

\[ \nu\, ax.\ \mu\, ay.\ [-\{\}]\,((\langle \{c\} \rangle tt \land ax) \lor ay) \qquad (3) \]



2.2 Semantics 

Given the above syntax of mu-calculus formulas, we can talk about free and 
bound formula names. For instance, in the alternating fixed point formula given 
above, consider the inner least fixed point subformula: in that subformula, the 
name ax occurs free and ay is bound. To associate a meaning with each formula, 
we consider environments that determine the meanings of the free names in a 
formula. 

A mu-calculus formula’s semantics is given in terms of a set of LTS states. 
The environments map each formula name to a set of LTS states. We denote the 
semantics of a formula $\varphi$ as $[\![\varphi]\!]_{\sigma}$, where $\sigma$ is the environment.

The formal semantics of propositional mu-calculus is given in Figure 1. The 
set U denotes the set of all states in a given LTS, whose transition relation 
is represented by trans/3. The equations defining the semantics of boolean 
operations, as well as the existential and universal modalities are straightforward. 
The least fixed point is defined as the intersection (i.e., the smallest) of all the 
pre-fixed points, while the greatest fixed point is defined as the union (i.e., the
largest) of all post-fixed points. 
Fig. 1. Semantics of propositional modal mu-calculus
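
For reference, the equations described in the text take the following standard form. This is a reconstruction from the prose and from the usual presentation of the modal mu-calculus, written in the notation of the value-passing semantics given later. It should be read as a sketch rather than as a verbatim copy of the figure; in particular, the environment-update notation $\{Z \leftarrow S\} \circ \sigma$ is an assumption.

```latex
\[
\begin{array}{rcl}
[\![ tt ]\!]_{\sigma} & = & U \\
[\![ ff ]\!]_{\sigma} & = & \emptyset \\
[\![ Z ]\!]_{\sigma} & = & \sigma(Z) \\
[\![ \varphi_1 \lor \varphi_2 ]\!]_{\sigma} & = & [\![ \varphi_1 ]\!]_{\sigma} \cup [\![ \varphi_2 ]\!]_{\sigma} \\
[\![ \varphi_1 \land \varphi_2 ]\!]_{\sigma} & = & [\![ \varphi_1 ]\!]_{\sigma} \cap [\![ \varphi_2 ]\!]_{\sigma} \\
[\![ \langle A \rangle \varphi ]\!]_{\sigma} & = & \{ s \mid \exists a \in A,\ \exists t.\ \mathit{trans}(s, a, t) \land t \in [\![ \varphi ]\!]_{\sigma} \} \\
[\![ [A] \varphi ]\!]_{\sigma} & = & \{ s \mid \forall a \in A,\ \forall t.\ \mathit{trans}(s, a, t) \Rightarrow t \in [\![ \varphi ]\!]_{\sigma} \} \\
[\![ \mu Z. \varphi ]\!]_{\sigma} & = & \bigcap \{ S \subseteq U \mid [\![ \varphi ]\!]_{\{Z \leftarrow S\} \circ \sigma} \subseteq S \} \\
[\![ \nu Z. \varphi ]\!]_{\sigma} & = & \bigcup \{ S \subseteq U \mid S \subseteq [\![ \varphi ]\!]_{\{Z \leftarrow S\} \circ \sigma} \}
\end{array}
\]
```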



2.3 Model Checker as a Logic Program 

We now describe how a model checker for modal mu-calculus can be encoded 
as a logic program. We first outline the representation of mu-calculus formulas, 
and then derive a model checker based on the semantics in Figure 1. 

Syntax We represent modal mu-calculus formulas by a set of fixed point equa- 
tions, analogous to the way in which lambda-calculus terms are presented using 
the combinator notation. We denote least fixed point equations by += and great- 
est fixed point equations by -=. We also mark the use of formula names by 
enclosing it in a form(-) constructor. The syntax of encoding of mu-calculus 
formulas is given by the following grammar: 

    F → form(Z) | tt | ff | F \/ F | F /\ F | diam(A, F) | box(A, F)

    D → Z += F     (least fixed point)
      | Z -= F     (greatest fixed point)

Nested fixed point formulas are encoded as a set of fixed point equations. For 
instance, the nested fixed point formula for “always eventually b” (Formula (2))
is written in equational form as:

    ib -= box(-{}, form(ib)) /\ diam(-{}, form(eb))          (4)
    eb += diam({b}, tt) \/ diam(-{b}, form(eb))

Alternating fixed point formulas can be captured in equational form by ex- 
plicitly parameterizing each fixed point by the enclosing formula names. For 
instance, the property “infinitely often c” (Formula (3)) is written in equational 
form as: 

    ax     -= form(ay(form(ax)))                                     (5)
    ay(AX) += box(-{}, (diam({c}, tt) /\ AX)) \/ form(ay(AX))
Semantics Based on the semantics in Figure 1, a model checker for propositional
modal mu-calculus can be encoded using a predicate models/2 which
verifies whether a state in an LTS models a given formula. For encoding the
semantics, note that the existential quantifier and the least fixed point computation
can be inherited directly from the Horn clause notation and tabled resolution
(minimal model computation), respectively. The encoding of a model checker for
this sublogic is given in Figure 2.



    models(State_S, tt).

    models(State_S, (F1 \/ F2)) :-
        models(State_S, F1) ;
        models(State_S, F2).

    models(State_S, (F1 /\ F2)) :-
        models(State_S, F1),
        models(State_S, F2).

    models(State_S, diam(As, F)) :-
        trans(State_S, Action, State_T),
        member(Action, As),
        models(State_T, F).

    models(State_S, form(FName)) :-
        FName += Fexp,
        models(State_S, Fexp).



Fig. 2. A model checker for a fragment of propositional mu-calculus 



In order to derive a model checker for the remainder of the logic, two key 
issues need to be addressed: (i) an encoding of the ‘$\forall$’ quantifier which is used
in the definition of the universal modality, and (ii) a mechanism for computing 
the greatest fixed points. 

Encoding Greatest Fixed Point Computation: The greatest fixed point 
computation can be encoded in terms of its dual least fixed point computation 
using the following identity: 

\[ \nu Z.\varphi \equiv \neg \mu Z'.\neg \varphi[\neg Z'/Z] \]
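
The duality can be checked on a small example. The sketch below (Python, illustrative only; the LTS and all function names are invented here) computes the deadlock-freedom property of Formula (1) twice: directly as a greatest fixed point, and as the complement of the least fixed point of the dual body.

```python
TRANS = {("s0", "a", "s1"), ("s1", "b", "s0"), ("s1", "c", "s2"), ("t0", "a", "t0")}
STATES = {s for (s, _, _) in TRANS} | {t for (_, _, t) in TRANS}

def successors(state):
    return {t for (s, _, t) in TRANS if s == state}

def lfp(step):
    """Least fixed point of a monotone set function: iterate upward from the empty set."""
    current = set()
    while step(current) != current:
        current = step(current)
    return current

def gfp(step):
    """Greatest fixed point of a monotone set function: iterate downward from all states."""
    current = set(STATES)
    while step(current) != current:
        current = step(current)
    return current

def phi(z):
    """Body of Formula (1): [-{}]Z and <-{}>tt."""
    return {s for s in STATES if successors(s) and successors(s) <= z}

def dual(z_neg):
    """Negated body applied to a negated argument: not phi(not Z')."""
    return STATES - phi(STATES - z_neg)

direct = gfp(phi)
via_negation = STATES - lfp(dual)
print(direct == via_negation)   # True
```

On this LTS the least fixed point of the dual collects the states that can reach a deadlock, and its complement is exactly the set obtained by the direct greatest fixed point computation.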

For alternation-free mu-calculus formulas, observe that the negations between 
a binding occurrence of a variable and its bound occurrence can be eliminated 
by reducing the formula to negation normal form. For instance, the nested fixed 
point formula above (Formula (4)) can be encoded using least fixed point operators
and negation (denoted by neg(·)) as:
    ib  += neg(form(nib))
    nib += diam(-{}, form(nib)) \/ box(-{}, neg(form(eb)))
    eb  += diam({b}, tt) \/ diam(-{b}, form(eb))

Thus, there are no cycles through negation for alternation-free formulas. For
formulas with alternation, the transformation introduces cycles through negation.
For instance, the alternating fixed point formula originally encoded as

    ax     -= form(ay(form(ax)))
    ay(AX) += box(-{}, (diam({c}, tt) /\ AX)) \/ form(ay(AX))

can be encoded using least fixed point operators and negation as

    ax     += neg(form(nax))
    nax    += neg(form(ay(neg(form(nax)))))
    ay(AX) += box(-{}, (diam({c}, tt) /\ AX)) \/ form(ay(AX))

Note the cycle through negation in the definition of nax that cannot be elimi- 
nated. Given that all the currently-known algorithms for model checking alter- 
nating formulas are exponential in the depth of alternation, it appears highly 
unlikely that there is some formulation of this problem in terms of negation that 
avoids negative cycles. 

Cycles and Negation: Logic programs where predicates are not (pairwise) mu- 
tually dependent on each other via negation are called stratified. A stratified 
program has a unique least model which coincides with its well-founded 
model [vRS91], as well as its stable model [GL88]. A non-stratified program 
(i.e., where there are cycles through negation) may have multiple stable mod- 
els or none at all; whereas it has a unique (possibly three-valued) well-founded 
model. For non-stratified programs, well founded models can be computed in 
polynomial time [GW96], while determining the presence of stable models is 
NP-complete. Hence, we avoid cycles through negation in our encoding wher- 
ever possible. 



    forall(_BoundVars, Antecedent, Consequent) :-
        bagof(Consequent, Antecedent, ConsequentList),
        all_true(ConsequentList).

    all_true([]).
    all_true([Goal|Rest]) :-
        call(Goal),
        all_true(Rest).



Fig. 3. Implementing forall/3 using Prolog builtins 








Encoding the Universal Quantifier: The universal quantifier can be cast 
in terms of its dual existential quantifier using negation, but this can intro- 
duce cycles through negation even for alternation-free mu-calculus formulas. For 
instance, in the formula expressing deadlock-freedom property (Formula (1)) re- 
placing the box-modality with its dual using negation will result in the following 
encoding: 

    df -= neg(diam(-{}, neg(form(df)))) /\ diam(-{}, tt)

Hence we retain the box-modality and use an explicit programming construct 
forall/3 to encode the model checker. The forall/3 construct can itself be 
implemented using other Prolog builtins as shown in Figure 3. It should be 
noted that the above implementation of forall/3 is correct only when there are 
no free variables. In the presence of free variables, one needs to keep track of 
their substitutions, and in general, disequality constraints on their substitutions. 
However, for the mu-calculus model checkers considered in this paper, we can 
ensure that there are no free variables in any use of forall/3, and hence this 
simple implementation suffices. When there are no free variables, we do not need 
to keep track of bound variables either, and hence the Prolog variable BoundVars 
in the above implementation is treated like an anonymous variable. 

The encoding of the model checker for the remainder of the logic is shown 
in Figure 4. Taken together with Figure 2, the encoding reduces model checking 
to logic-program query evaluation: verifying whether a state S models a formula
F is done by issuing the query models(S, F). By using a goal-directed
query evaluation mechanism, we ensure that the resultant model checker ex- 
plores only a portion of the state space that is sufficient to prove or disprove a 
property [RRR+97]. 

3 Value-Passing Modal Mu-Calculus 

We consider value-passing modal mu-calculus where the actions in modalities 
may be terms with data variables, and the value of these variables can be 
“tested” using predicates. The quantification of a data variable is determined 
by the modality in which it first appears. A data variable bound by a dia- 
mond modality is existentially quantified, and that bound by a box modality 
is universally quantified. The scope of the modality is same as the scope of the 
quantification. 

The presence of data variables makes property specification concise, and, 
more importantly, independent of the underlying transition system. For instance, 
consider the specification of a property of a protocol that states that every mes- 
sage sent will be eventually received, where different messages are distinguished 
by identifiers ranging from 1 to some integer n. This property is expressed by 
the formulas in Figure 5, encoded in logics without (non-value-passing) and 
with (value-passing) data variables. Consider the non-value-passing case where
the number of distinct messages (i.e., n in the figure) is two. The formula str
states that after every s1 (i.e., send of message 1) rcv1 must hold; and after
    models(State_S, box(As, F)) :-
        forall(State_T,
               (trans(State_S, Action, State_T), member(Action, As)),
               models(State_T, F)).

    models(State_S, form(FName)) :-
        FName -= Fexp,
        negate(Fexp, NegFexp),
        tnot(models(State_S, NegFexp)).

    models(State_S, neg(form(FName))) :-
        FName += Fexp,
        tnot(models(State_S, Fexp)).

    models(State_S, neg(form(FName))) :-
        FName -= Fexp,
        negate(Fexp, NegFexp),
        models(State_S, NegFexp).

    negate(tt, ff).
    negate(ff, tt).
    negate(F1 /\ F2, G1 \/ G2) :- negate(F1, G1), negate(F2, G2).
    negate(F1 \/ F2, G1 /\ G2) :- negate(F1, G1), negate(F2, G2).
    negate(diam(A, F), box(A, G)) :- negate(F, G).
    negate(box(A, F), diam(A, G)) :- negate(F, G).
    negate(form(FName), neg(form(FName))).



Fig. 4. A model checker for the remainder of the propositional mu-calculus 



Without value-passing:

    str  -= box({s1}, form(rcv1))
            /\ box({s2}, form(rcv2)) ...
            /\ box({sn}, form(rcvn))
            /\ box(-{s1, s2, ..., sn}, form(str))
    rcv1 += box(-{r1}, form(rcv1))
    rcv2 += box(-{r2}, form(rcv2))
    ...
    rcvn += box(-{rn}, form(rcvn))

With value-passing:

    str    -= box({s(X)}, form(rcv(X))) /\
              box(-{s(_)}, form(str))
    rcv(X) += box(-{r(X)}, form(rcv(X)))



Fig. 5. Mu-calculus formulas with and without value-passing 











every s2 (i.e., send of message 2) rcv2 must hold; and after every non-send action
str itself holds. The formula str is a greatest fixed point equation since
the property holds on all (infinite) paths of evolution of the system that contain
no sends. The formula rcv1 (rcv2) states that the action r1 (r2) is eventually
enabled on all evolutions of the system.

Note that in the non-value-passing case, we have to enumerate the rcv_i for
each i. In contrast, we can state the property in a value-passing logic with a
single formula using variables quantified over [1, n]. As the example shows, the
formula need not even specify the domain of quantification if the underlying 
system generates only values in that domain. 
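
The difference in size can be seen by generating both encodings mechanically. The sketch below is an illustrative Python script that emits the equations as strings in the paper's equational syntax. The function names are invented for this sketch, and the propositional encoding it produces follows the reading of Figure 5 in which the four box conjuncts of str are joined by `/\`.

```python
def propositional_equations(n):
    """Equations for the non-value-passing encoding with message ids 1..n."""
    sends = [f"s{i}" for i in range(1, n + 1)]
    all_sends = ", ".join(sends)
    # One conjunct per message id, plus one for all non-send actions.
    conjuncts = [f"box({{{s}}}, form(rcv{i}))" for i, s in enumerate(sends, start=1)]
    conjuncts.append(f"box(-{{{all_sends}}}, form(str))")
    equations = ["str -= " + " /\\ ".join(conjuncts)]
    equations += [f"rcv{i} += box(-{{r{i}}}, form(rcv{i}))" for i in range(1, n + 1)]
    return equations

def value_passing_equations():
    """The value-passing encoding does not depend on n."""
    return [
        "str -= box({s(X)}, form(rcv(X))) /\\ box(-{s(_)}, form(str))",
        "rcv(X) += box(-{r(X)}, form(rcv(X)))",
    ]

for n in (2, 6):
    print(n, len(propositional_equations(n)), len(value_passing_equations()))
```

The propositional encoding needs n + 1 equations, and its str equation grows with n, whereas the value-passing encoding always consists of the same two equations.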

3.1 Syntax 

Before we describe the semantics of the value-passing logic, we extend our defi- 
nition of LTSs, to include labels that are drawn from a finite set of ground terms 
(i.e., terms without variables) instead of just atoms. Similarly, we extend the 
syntax of the logic to have actions in modalities range over arbitrary (ground 
or nonground) terms. The syntax of the value-passing logic is thus extended as 
follows. 

We divide the set of symbols into four disjoint sets: variables, predicate sym- 
bols, function symbols, and formula names. Terms built over these symbols are 
such that the formula names and predicate symbols occur only at the root. Let 
F represent terms with formula names at root; P represent terms with predicate
symbols at root; A represent a set of terms with function symbols at root.
The set of predicate symbols includes the two base propositions tt and ff. Then
value-passing mu-calculus formulas are given by the following syntax: 

\[ \psi \rightarrow F \mid P \mid \psi \lor \psi \mid \psi \land \psi \mid \langle A \rangle \psi \mid [A]\psi \mid \mu F.\psi \mid \nu F.\psi \]

A variable that occurs for the first time in a modality is bound at that occurrence. 
The scope of the binding spans the scope of the modality. In the following, we 
consider only value-passing formulas that are closed: i.e., those that contain no 
free variables. 

3.2 Semantics 

As in the propositional case, the semantics of value-passing formulas is given in 
terms of a set of LTS states. However, due to the presence of data variables, we 
split the environment into two parts: $\sigma$ that maps each formula name to a set
of LTS states, and $\theta$ that maps each data variable to a term. We denote the
semantics of a formula $\psi$ as $[\![\psi]\!]_{\sigma,\theta}$, where $\sigma, \theta$ are the pair of environments. The
semantics of value-passing modal mu-calculus is given in Figure 6. The salient
aspect of value passing in the logic is that the values of data variables are picked
up from the labels in the underlying LTS. This is captured in the semantics of
diamond and box modalities by picking up a substitution $\theta'$ that matches an
action in the set $A$ with some label in the LTS, and evaluating the remaining
formula under the effect of this substitution (i.e., $\theta' \circ \theta$).
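
The way modalities pick up values from labels can be sketched as follows. This is an illustrative Python fragment covering only the two modalities, not fixed points, and it is not the paper's implementation. The term representation, the example LTS, and the names `match`, `diamond`, `box` and `receive_same` are assumptions of the sketch. Extending the dictionary `theta` plays the role of composing the matching substitution with the current environment.

```python
def is_var(term):
    """Variables are strings starting with an upper-case letter."""
    return isinstance(term, str) and term[:1].isupper()

def match(pattern, ground, theta):
    """One-way matching: extend theta so that pattern[theta] == ground, else None."""
    if is_var(pattern):
        if pattern in theta:
            return theta if theta[pattern] == ground else None
        return {**theta, pattern: ground}
    if isinstance(pattern, tuple) and isinstance(ground, tuple):
        if pattern[0] != ground[0] or len(pattern) != len(ground):
            return None
        for p_arg, g_arg in zip(pattern[1:], ground[1:]):
            theta = match(p_arg, g_arg, theta)
            if theta is None:
                return None
        return theta
    return theta if pattern == ground else None

# Ground LTS: labels are ground terms such as s(1) or r(2).
TRANS = {
    ("p0", ("s", 1), "p1"),
    ("p0", ("s", 2), "p2"),
    ("p1", ("r", 1), "p0"),
    ("p2", ("r", 1), "p0"),   # message 2 is answered with r(1), not r(2)
}

def diamond(state, pattern, theta, body):
    """<pattern>body: some matching transition leads to a state satisfying body."""
    for (src, label, dst) in TRANS:
        if src == state:
            theta2 = match(pattern, label, theta)
            if theta2 is not None and body(dst, theta2):
                return True
    return False

def box(state, pattern, theta, body):
    """[pattern]body: every matching transition leads to a state satisfying body."""
    for (src, label, dst) in TRANS:
        if src == state:
            theta2 = match(pattern, label, theta)
            if theta2 is not None and not body(dst, theta2):
                return False
    return True

def receive_same(state, theta):
    """<r(X)>tt, evaluated under the binding for X chosen by the enclosing modality."""
    return diamond(state, ("r", "X"), theta, lambda s, th: True)

print(box("p0", ("s", "X"), {}, receive_same))      # False: s(2) is followed by r(1)
print(diamond("p0", ("s", "X"), {}, receive_same))  # True: s(1) is followed by r(1)
```

The same variable X is universally quantified when first bound by the box and existentially quantified when first bound by the diamond, which is why the two queries give different answers on this LTS.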
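\[
\begin{array}{rcl}
[\![ P ]\!]_{\sigma,\theta} & = & \begin{cases} U & \text{if } P\theta \text{ is true} \\ \emptyset & \text{if } P\theta \text{ is false} \end{cases} \\
[\![ F ]\!]_{\sigma,\theta} & = & \sigma(F) \\
[\![ \psi_1 \lor \psi_2 ]\!]_{\sigma,\theta} & = & [\![ \psi_1 ]\!]_{\sigma,\theta} \cup [\![ \psi_2 ]\!]_{\sigma,\theta} \\
[\![ \psi_1 \land \psi_2 ]\!]_{\sigma,\theta} & = & [\![ \psi_1 ]\!]_{\sigma,\theta} \cap [\![ \psi_2 ]\!]_{\sigma,\theta} \\
[\![ \langle A \rangle \psi ]\!]_{\sigma,\theta} & = & \{ s \mid \exists a \in A \text{ and substitution } \theta' \text{ such that } \mathit{trans}(s, a\theta', t) \text{ and } t \in [\![ \psi ]\!]_{\sigma,\theta' \circ \theta} \} \\
[\![ [A] \psi ]\!]_{\sigma,\theta} & = & \{ s \mid \forall a \in A \text{ and substitution } \theta' \text{ such that } \mathit{trans}(s, a\theta', t):\ t \in [\![ \psi ]\!]_{\sigma,\theta' \circ \theta} \} \\
[\![ \mu F.\psi ]\!]_{\sigma,\theta} & = & \bigcap \{ S \mid [\![ \psi ]\!]_{\{F \leftarrow S\} \circ \sigma,\theta} \subseteq S \} \\
[\![ \nu F.\psi ]\!]_{\sigma,\theta} & = & \bigcup \{ S \mid S \subseteq [\![ \psi ]\!]_{\{F \leftarrow S\} \circ \sigma,\theta} \}
\end{array}
\]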



Fig. 6. Semantics of value-passing modal mu-calculus 



3.3 Model Checking the Value-Passing Modal Mu-Calculus 

We first extend the syntax of our encoding of mu-calculus (Section 2.3) by re- 
placing the propositions tt and ff by the more general pred(P) where P is a 
term representing a predicate (e.g., pred(X=Y)). Note that, in our encoding, for- 
mula names are already terms (to accommodate parameters used for alternating 
fixed points). 

From the semantics of the propositional calculus (Figure 1) and that of the 
value-passing calculus (Figure 6), observe that a model checker for the value- 
passing case needs to (i) maintain and propagate the substitutions for data 
variables; and (ii) evaluate predicates defined over these variables. Note that 
when these semantic equations are implemented by a logic programming sys- 
tem, no additional mechanism is needed to propagate the substitutions on data 
variables. In other words, an environment that maps variables to values is al- 
ready maintained by a logic programming engine. Hence, a model checker for 
the value-passing case can be derived from that for the propositional case by 
replacing the tt rule (first clause in Figure 2) with the following rule: 

    models(State_S, pred(Pred)) :- call(Pred).

Also, note that since the labels of an LTS are ground terms, the query
trans(s, a, t) for any given s is always safe; i.e., a and t are ground terms upon
return from the query. This ensures that every call to models/2 is ground,
provided the initial query is ground. Hence, the model checker for the value-passing
calculus can be evaluated, without the need for constraint processing, on any
logic programming system that is complete for programs with the bounded
term-size property. In fact, since every call to models/2 is ground, there are no free
variables to worry about when using the forall construct: we can simply use
the implementation shown in Figure 3.
3.4 Experimental Results

From the relatively minor change to the definition of models/2, it is easy to see 
that the performance of the model checker for the value-passing calculus, when 
used on a propositional formula, will be no worse than the specialized model 
checker for the propositional case. The interesting question then is to compare 
the performance of the two model checkers on formulas that can be expressed 
in the propositional calculus but more compactly in the value-passing calculus 
(e.g., see Figure 5). 

Table 1 summarizes the performance of the value-passing model checker and 
the propositional model checker on the property shown in Figure 5. The mea- 
surements were taken on a 600MHz Pentium III with 256MB running Linux 2.2. 
We checked the validity of that property for different values of domain size (n in 
that figure) on a specification of a two-link alternating bit protocol (ABP). The 
two-link version of the protocol is obtained by cascading two ABP specifications, 
connecting the receiver process of one link to the sender of the next. We chose 
the two-link version since the single-link ABP is too small for any meaningful 
performance measurement. 

The space and time performance from the table shows that the value-passing 
model checker performs as well as the propositional one; the difference in speeds 
can be attributed to the encoding used in the propositional formula, where a
modality with a variable is expanded to a sequence of explicit conjunctions or
disjunctions. More experiments are needed to determine whether the succinctness of
value-passing formulas does indeed have an impact on performance.



                 Propositional MC      Value-passing MC
    Domain Size  Time      Space       Time      Space
         2        4.6s      4.3M        4.3s      4.5M
         3       12.9s      8.2M       11.6s      8.5M
         4       24.1s     13.0M       20.9s     13.5M
         5       39.8s     19.1M       32.6s     19.8M
         6       56.7s     26.2M       47.5s     27.0M



Table 1. Performance of propositional and value-passing model checkers 
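
The relative differences in Table 1 are easier to judge as ratios. The short Python calculation below uses only the figures from the table; the entries for domain sizes 5 and 6 in the propositional space column are read as 19.1M and 26.2M, and the names `TIMES`, `SPACE` and `ratios` are local to this sketch.

```python
# (domain size, propositional MC, value-passing MC), copied from Table 1.
TIMES = [(2, 4.6, 4.3), (3, 12.9, 11.6), (4, 24.1, 20.9), (5, 39.8, 32.6), (6, 56.7, 47.5)]  # seconds
SPACE = [(2, 4.3, 4.5), (3, 8.2, 8.5), (4, 13.0, 13.5), (5, 19.1, 19.8), (6, 26.2, 27.0)]    # megabytes

def ratios(rows):
    """Value-passing figure divided by propositional figure, per domain size."""
    return {n: round(vp / prop, 2) for n, prop, vp in rows}

print(ratios(TIMES))
print(ratios(SPACE))
```

On these figures the value-passing checker takes roughly 0.82 to 0.93 times as long as the propositional one, while using about 3 to 5 percent more memory.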



4 Conclusions and Future Work 

We showed how the power of logic programming for handling variables and sub- 
stitutions can be used to implement model checkers for value-passing property 
logics with very little additional effort and performance penalty. 

Two crucial — although common — restrictions we placed on the property 
logics and transition systems contributed to this simplicity. First, we considered 
only closed formulas in the property logics. This restriction is also placed in 
the other works on value-passing logics [RH97, Mat98]. With this restriction,
we ensured that the model checking query still produces only a yes/no answer.
The interesting problem that we are currently investigating is whether model 
checking can truly be a query: i.e., find substitutions for free variables in the 
property formula that make the property true. This problem is inspired by the 
recent work on temporal logic queries [Cha00]. It turns out (as is to be expected)
that the context in which a variable occurs in a formula dictates whether that 
query evaluation can be done without the use of additional mechanisms such as 
constraint handling. 

Secondly, we considered only ground transition systems, where the states 
of the system and labels on the transitions were ground terms. Most verifica- 
tion tools allow only ground transition systems to be described, so this is not an 
unusual restriction. For instance, Mateescu [Mat98] considers only ground transi- 
tion systems in the context of value-passing logics; Rathke and Hennessy [RH97], 
on the other hand, consider nonground systems. Ground transition systems
meant that the model checker would terminate even for value-passing logics, since
the substitutions for variables in the logics were picked up from the terms oc- 
curring in the transition system. Relaxing this restriction, to allow symbolic 
transition systems where transition labels and states may be nonground terms, 
will allow us to model check individual modules of a specification. This can be 
done either by allowing the variables in the property logic to be typed (from 
finite types, to ensure termination), or by evaluating the model checker under a 
constraint logic programming system. 

In this paper, we focused on aspects of value-passing that can be handled 
without constraint processing machinery. This work, as well as our concurrent 
work on bisimulation of value-passing systems [MRRV00], indicates that verification
tools for general value-passing systems can be built by judiciously combining
tabulation and constraint processing. The performance implications of using 
constraints, however, remain to be explored. 
Abstract. We describe practical experiences of using a logic program-
ming based approach to model and reason about concurrent systems. 
We argue that logic programming is a good foundation for developing, 
prototyping, and animating new specification languages. In particular, 
we present the new high-level specification language CSP(LP), unifying 
CSP with concurrent (constraint) logic programming, and which we hope 
can be used to formally reason both about logical and concurrent aspects 
of critical systems. 

Keywords: Specification, Verification, Concurrency, Implementation, 
and Compilation. 



1 Introduction and Summary 

This (position) paper describes work carried out within the recently started 
projects ABCD¹ and iMoc.² The objective of the ABCD project is to increase
the uptake of formal methods in the business critical systems industry by low- 
ering the cost of entry and increasing the benefits of using formal modelling. 
The focus is on support of system definition and architectural design so that 
the systems integrator can more easily model systems and validate proposed 
system architectures. A main point is to apply formal methods — such as model 
checking — early on in the software development cycle, i.e., on high-level specifi- 
cations or high-level prototypes of the particular business critical system to be 
developed. 

To effectively lower the cost of entry for industrial users, we want a
specification and/or prototyping language which is

- powerful and high-level enough to specify business and safety critical systems
in a natural and succinct manner. For example, we want sophisticated
datastructures and do not want to force the architect to have to come up with
artificial abstractions himself.

- expressive enough to easily express and reason about concurrency. Ideally,
one would want to compose larger systems from smaller ones, e.g., by putting
them in parallel and allowing various forms of interaction (synchronisation,
asynchronous message passing, shared memory, ...), allowing encapsulation
(hiding, ...) and re-use (renaming, parameterisation, ...). Ideally, one may
also want to consider timing aspects (timeouts, interrupts, ...).

- usable by industrial partners (graphical notation) and reduces the potential
for errors (types, consistency checking, ...).

¹ “Automated validation of Business Critical systems using Component-based Design,”
EPSRC grant GR/M91013, with industrial collaborators IBM, ICL, Praxis Critical
Systems, and Roke Manor Research.

² “Infinite state MOdel Checking using partial evaluation and abstract interpretation,”
EPSRC grant GR/N11667.

I.V. Ramakrishnan (Ed.): PADL 2001, LNCS 1990, pp. 14-28, 2001.
© Springer-Verlag Berlin Heidelberg 2001

To increase the benefits, we want the systems architect to be able to probe 
and examine his specification or prototype, e.g., via animation, and automatic 
verification, such as model checking. 

Currently, the specification language issue within ABCD is still under inves- 
tigation. First case studies were carried out using the B-method [1] and CSP 
with functions and datatypes [15, 29, 9]. Other languages and formalisms, such 
as Petri nets, the C-like language Promela of Spin [16], the Alloy language of Al- 
coa [17], LOTOS, are being studied (see also [13]). However, the language issue 
is far from fixed, and we wish to experiment with various extensions and inte- 
grations ([4]) of the paradigms. One might also want to consider domain-specific 
specification languages (for the ABC project with IBM a domain specific com- 
pensation operator has been developed for reliable transaction processing [8]). 
As we will show in this paper, an approach based on logic programming and 
partial evaluation seems very promising for this problem: 

- a lot of languages can be easily encoded in logic programming, because of
features such as non-determinism, unification, co-routining, constraints, ...

- one can get compilers using offline partial evaluators such as logen [18].
One can do more sophisticated optimisations using online partial evaluation
systems such as mixtus [30], SP [10], or ecce [22]. One can also apply
existing analysers to infer properties about the source program (see [26]).

- one can easily animate such languages in existing Prolog systems.

- one can do finite and infinite state model checking [27, 6], [7], [23], [25].

All of these points will be substantiated in this paper, through a non-trivial
high-level specification language. The design of this new language CSP(LP) was
heavily influenced by the possibilities of (constraint) logic programming. In
particular, CSP(LP) supports, amongst others, complex datatypes, constraints,
concurrency, message passing via channels à la CSP and CCS, and asynchronous
message passing via streams. We hope that CSP(LP) is a contribution in its
own right, and can be used to formally reason both about logical and concurrent
aspects of critical systems.

2 CSP, CSP-FDR, and CSP(LP) 

2.1 Elementary CSP 

CSP is a process algebra defined by Hoare [15]. The first semantics associated
with CSP was a denotational semantics in terms of traces, failures and (failure
and) divergences. Elementary CSP (without datatypes, functions, or other
advanced operators) can be defined as follows. Given Σ, a finite or enumerable set
of actions (which we will henceforth denote by lower case letters a, b, c, ...), and
X, an enumerable set of variables or processes (which we henceforth denote by
identifiers such as Q, R, ..., or MYPROCESS starting with an uppercase letter),
the syntax of a basic CSP expression is defined by the following grammar (where
A denotes a set of actions):

P ::= STOP (deadlock) | SKIP (success) |
      a → P (prefix) | P ⊓ P (internal choice) |
      P □ P (external choice) | P |A| P (parallel composition) |
      P \ A (hiding) | Q (instantiation of a process)

Moreover, each process Q used must have a (possibly recursive) definition Q = P.

We suppose that all used processes are defined by at least one recursive
definition (if there is more than one definition this is seen to be like an external
choice of all the right-hand sides). In the following, we also suppose the alphabet
Σ to be finite.

Intuitively, a → P means that the system proposes the action a to its
environment, which can decide to execute it. The external choice is resolved by
the environment (except when two branches propose the same action, where a
nondeterministic choice is taken in case the environment chooses that action).
Internal choice is made by the system without any control from the environment.
P |A| Q is the generalized parallel operator of [29], and means that the process
P synchronizes with Q on any action in the set of actions A. If an action outside
A is enabled in P or Q, it can occur without synchronization of both processes.
Pure interleaving P |∅| Q is denoted by P ||| Q. Pure synchronization P |Σ| Q
is denoted by P || Q. The hiding operator P \ A makes any visible action a ∈ A
of P invisible. In the operational semantics [29] the latter (as well as the
internal choice) is achieved using the internal action τ.

A major practical difference³ with CCS [24] is that CSP allows for
synchronisation of an arbitrary number of processes (while CCS only supports binary
synchronisation). This makes CSP processes more difficult to implement, but on
the other hand more suitable as the basis of a high-level specification language
(which is what we are after in this paper).

³ There are plenty of other differences of course. For example, CSP has been
developed with denotational semantics in mind, while CCS is more tightly linked with
bisimulation.



2.2 CSP-FDR and CSP(LP)

The elementary CSP presented above is not very useful in practice, either as a
programming or a specification language, because of its lack of value passing and
the absence of more refined operators. [29, 9, 31] hence present an extension of
CSP (which we henceforth call CSP-FDR), along with a machine-readable ASCII
syntax, which is usable in practice. CSP-FDR and the tools fdr and probe
are used by government institutions and companies, e.g., in the semiconductor,
defence, aerospace, and security areas (for the latter, see [28]). In this extension
of CSP one can:

- pass tuples of datavalues on channels. For example, a!1 → STOP will output
the datavalue 1 on the channel a, while a?x → P(x) will read a datavalue
on channel a and bind x to the value read. One can also put additional
constraints on values that are received on a channel, as in a?x : x > 2 → P(x).

- use constructs such as the if-then-else, let-constructs, and operators such as 
interrupt and timeout 

- use sets and set operations, sequences and sequence operations, integers and 
arithmetic operators, tuples, enumerated types, ranges, ... 

This extension of CSP was heavily influenced by functional programming
languages, and hence relies on pattern matching as a means to synchronise on
channels. As a consequence, a?x → P(x) cannot synchronise with a?y → Q(y).
In the remainder of this paper we show how, by using logic rather than
functional programming as our foundation, we can overcome this limitation, thus
leading to a more powerful language CSP(LP) which reconciles CSP-FDR with
(concurrent) logic programming languages.

The basic syntax of CSP(LP) is summarised in Figure 1. The semantics of 
CSP(LP) will become clear in the next section, where we show how it can be 
mapped to logic programming. 



Operator                Syntax                Ascii Syntax
stop                    STOP                  STOP
skip                    SKIP                  SKIP
prefix                  a → P                 a->P
conditional prefix      a?x : x > 1 → P       a?x:x>1->P
external choice         P □ Q                 P [] Q
internal choice         P ⊓ Q                 P |~| Q
interleaving            P ||| Q               P ||| Q
parallel composition    P |A| Q               P [| A |] Q
sequential composition  P ; Q                 P ->> Q
hiding                  P \ A                 P \\ A
renaming                P[R]                  P [[ R ]]
timeout                 P ▷ Q                 P [> Q
interrupt               P △i Q                P /\ Q
if then else            if t then P else Q    if T then P else Q
let expressions         let v = e in P        let V=E in P
agent definition        A = P                 A = P;

Fig. 1. Summary of syntax of CSP(LP)

3 CSP(LP) and Logic Programming

3.1 Implementation 

In this section we present an operational semantics of CSP(LP). Rather than
using natural deduction style rules (as in [29]) we immediately present the
(declarative) Prolog code. This code implements a ternary relation trans, where
trans(e,a,e') means that the CSP(LP) expression e can evolve into the
expression e' by performing the action a. Our language is not committed-choice:
there can be multiple solutions for the same e, and the order of those solutions
is not relevant. Apart from a few minor additions, the rules below basically implement
the operational semantics of [29]; the main difference lies in the generalisation
of the synchronisation mechanism.



Basic Operators First, the elementary processes STOP and SKIP are
extremely straightforward to encode:

trans(stop,_,_) :- fail.
trans(skip,tick,stop).

The unconstrained prefix operator is not much more difficult. Observe that
the first argument to prefix is a tuple of values (annotated by either “!”, “?”, or
“.”). The constrained prefix operator is slightly more complicated. The
implementation of the predicate test makes use of the co-routining and
constraints facilities of SICStus Prolog. Notably, we use, e.g., the SICStus dif
predicate, which delays if the arguments are not sufficiently instantiated.

trans(prefix(V,Ch,X), io(V,Ch), X).
trans(prefix(V,Ch,Constraint,X), io(V,Ch), X) :- test(Constraint).

The CSP choice operators □, ⊓ are implemented exactly as in [29]. The
rules for the external choice might seem a bit surprising at first, but they are
needed to ensure that τ → P behaves like P (τ → P is equivalent to P in the
CSP semantics).

trans(int_choice(X,_Y),tau,X).
trans(int_choice(_X,Y),tau,Y).
trans(ext_choice(X,_Y),A,X1) :- trans(X,A,X1), dif(A,tau).
trans(ext_choice(_X,Y),A,Y1) :- trans(Y,A,Y1), dif(A,tau).
trans(ext_choice(X,Y),tau,ext_choice(X1,Y)) :- trans(X,tau,X1).
trans(ext_choice(X,Y),tau,ext_choice(X,Y1)) :- trans(Y,tau,Y1).
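
The treatment of τ in the external choice rules can also be illustrated outside Prolog. The following is a minimal Python sketch, not the implementation described in this paper: process terms are encoded as nested tuples, and the helper names prefix, int_choice, ext_choice and the generator trans are assumptions made for this illustration. It shows that a τ step of one branch leaves the external choice pending, whereas a visible action resolves it.

```python
# Process terms are nested tuples; actions are strings, with "tau" and
# "tick" reserved. This encoding is hypothetical and chosen for illustration.
STOP = ("stop",)
SKIP = ("skip",)

def prefix(action, cont):
    return ("prefix", action, cont)

def int_choice(p, q):
    return ("int_choice", p, q)

def ext_choice(p, q):
    return ("ext_choice", p, q)

def trans(e):
    """Yield every (action, successor) pair of the process term e."""
    kind = e[0]
    if kind == "skip":
        yield ("tick", STOP)
    elif kind == "prefix":
        yield (e[1], e[2])
    elif kind == "int_choice":
        # The system silently commits to one of the two branches.
        yield ("tau", e[1])
        yield ("tau", e[2])
    elif kind == "ext_choice":
        x, y = e[1], e[2]
        for a, x1 in trans(x):
            # A visible action resolves the choice; tau keeps it pending.
            yield (a, x1) if a != "tau" else ("tau", ext_choice(x1, y))
        for a, y1 in trans(y):
            yield (a, y1) if a != "tau" else ("tau", ext_choice(x, y1))
    # "stop" has no transitions at all.
```

For the term ext_choice(int_choice(a → STOP, b → STOP), c → STOP), the sketch produces two τ steps that keep the external choice in place and one visible c step that resolves it.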

One can also implement the CCS style choice operator + (which has a
simpler operational semantics, but it treats τ → P differently from P and hence
neither the CSP failures-divergences semantics nor weak bisimulation [24] is a
congruence for +):

trans(ccs_choice(X,_Y),A,X1) :- trans(X,A,X1).
trans(ccs_choice(_X,Y),A,Y1) :- trans(Y,A,Y1).

Implementing sequential composition is again pretty straightforward: 

trans(seq(P,Q),A,seq(P1,Q)) :- trans(P,A,P1), dif(A,tick).
trans(seq(P,Q),tau,Q) :- trans(P,tick,_).

Finally, the timeout and interrupt operators ▷, △i are implemented as follows:

trans(timeout(P,_Q),A,P1) :- dif(A,tau), trans(P,A,P1).
trans(timeout(P,Q),tau,timeout(P1,Q)) :- trans(P,tau,P1).
trans(timeout(_P,Q),tau,Q).
trans(interrupt(P,Q),A,interrupt(P1,Q)) :- dif(A,tick), trans(P,A,P1).
trans(interrupt(P,Q),tick,omega) :- trans(P,tick,_).
trans(interrupt(P,Q),i,Q).



Agent Calls and Recursion When implementing agent calls to recursive 
definitions one has to be extremely careful in the presence of divergence and 
multiple agent equations. 

Suppose that we have a set of agent/2 facts which represent all agent
definition equations. These facts are generated from the CSP(LP) Ascii syntax using a
parser. A first prototype predictive recursive descent parser has been developed
in SICStus Prolog itself using DCGs.

If there are multiple equations for the same agent, then the entire agent 
should be seen as having an external choice between all individual equations. 
Now, the following piece of code seems the natural solution: 

trans(agent_call(X),A,NewExpr) :-
    evaluate_agent_call(X,EX), agent(EX,AE), trans(AE,A,NewExpr).

This implementation is rather efficient, amenable to techniques such as
partial evaluation (cf. Section 4.2), and the scoping of parameters works as
expected.⁴ It is also possible to add delay declarations to ensure that a call is only
unfolded if its arguments are sufficiently instantiated.⁵ However, the above code
is not always correct:

- when an infinite number of calls without visible action is possible the inter- 
preter might loop, instead of generating a divergent transition system. 

- it effectively treats all equations as being in a big CCS choice rather than
being within an external choice. The interpreter is thus only correct for agent
definitions which never produce a τ action as their first action.

The easiest way to solve this problem is to impose restrictions on the
equations to ensure that the above problems do not arise (similar to what is done
in FDR [9]). If the restrictions are not satisfied one has to generate an explicit
external choice and possibly explicit τ actions as well. This is not a practical
problem, however, as the translation can be done automatically by the parser.

⁴ Unlike the CSP in FDR [9], where the out!1 within P(out) = out!1->STOP does
not refer to the parameter out if a global channel out exists.

⁵ Although this can be semantically tricky: e.g., what is the logical meaning of a
floundering derivation?



Let Expressions and Conditionals Because of the absence of recursion, let 
expressions and conditionals are fortunately much less problematic to implement 
(below \+ denotes negation): 

trans(let(V,VExp,CExp),tau,CExp) :- evaluate_argument(VExp,Val), V=Val.

trans(if(Test,Then,_Else),A,X1) :- test(Test), trans(Then,A,X1).
trans(if(Test,_Then,Else),A,X1) :- \+(test(Test)), trans(Else,A,X1).
trans(if(Test,Then),A,X1) :- test(Test), trans(Then,A,X1).



Hiding, Renaming, and Restriction Hiding can be implemented by
replacing visible actions by the special τ action:

trans(hide(Expr,CList),A,hide(X,CList)) :-
    trans(Expr,A,X), dif(A,tick), not_hidden(A,CList).
trans(hide(Expr,CList),tau,hide(X,CList)) :-
    trans(Expr,A,X), hidden(A,CList).
trans(hide(Expr,_CList),tick,omega) :- trans(Expr,tick,_X).

hidden(A,CList) checks whether the channels of A appear within the
channel list CList and not_hidden is its sound negation, implemented using dif.

The following implements renaming. Note that, because of our logic
programming foundation, we can quite easily implement relational renaming [29].⁶

trans(rename(Expr,RenList),RA,rename(X,RenList)) :-
    trans(Expr,A,X), rename_action(A,RenList,RA).

One can also implement CCS-like restriction in much the same style. 
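
Relational renaming maps one action to several, so a single trace of the original process can give rise to several renamed traces. The effect on a finite set of traces can be sketched in Python as follows. This is an illustration only: the function name rename_traces and the encoding of the relation as a list of (old, new) pairs are assumptions made here, and the sketch works on traces rather than on the transition relation used by the interpreter.

```python
from itertools import product

def rename_traces(traces, relation):
    """Apply a renaming relation, given as (old, new) pairs, to a set of traces.

    An action with several images yields one renamed trace per image;
    actions not mentioned in the relation are kept unchanged.
    """
    def images(action):
        new = [n for old, n in relation if old == action]
        return new or [action]

    return {
        tuple(renamed)
        for trace in traces
        for renamed in product(*(images(a) for a in trace))
    }
```

Applied to the prefix-closed traces of a->b->STOP with the relation a<-c, a<-d, the sketch returns the empty trace together with the traces c, d, cb and db.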



Synchronisation Operators via Unification The essence of our
implementation of the parallel composition operators can be phrased as: “synchronisation
is unification.” The classical definition of synchronisation in CSP-FDR is based
on pattern matching: the synchronised values must either be ground and
identical (as in c!1 → P |{c}| c!1 → Q) or one value must be ground and the other a
free variable (as in c!1 → P |{c}| c?x → Q(x)). We can implement a
generalisation of this scheme, which allows variables or even partially instantiated terms
on both sides and where synchronisation results in unification. This leads to the
following piece of code for the generalised parallel operator:



trans(par(X,CList,Y), io(V,Ch), par(X1,CList,Y1)) :-
    trans(X, io(V1,Ch), X1), trans(Y, io(V2,Ch), Y1),
    unify_values(V1,V2,V), hidden(io(V,Ch),CList).
trans(par(X,CList,Y), A, par(X1,CList,Y)) :-
    trans(X,A,X1), dif(A,tick), not_hidden(A,CList). /* covers tau */
trans(par(X,CList,Y), A, par(X,CList,Y1)) :-
    trans(Y,A,Y1), dif(A,tick), not_hidden(A,CList). /* covers tau */
trans(par(X,CList,Y), tau, par(omega,CList,Y)) :- trans(X,tick,_).
trans(par(X,CList,Y), tau, par(X,CList,omega)) :- trans(Y,tick,_).
trans(par(omega,CList,omega), tick, omega).

⁶ E.g., the expression (a->b->STOP) [[ a<-c, a<-d ]] has the traces {c, d, cb, db}.

The interleaving operator ||| is then defined as follows:

trans(interleave(P,Q),A,R) :- trans(par(P,[],Q),A,R).

One can also implement CCS style synchronisation: 

trans(ccs_par(X,Y), tau, ccs_par(X1,Y1)) :-
    trans(X, io(V1,Ch), X1), trans(Y, io(V2,Ch), Y1),
    ccs_unify_values(V1,V2,_V).
trans(ccs_par(X,Y), A, ccs_par(X1,Y)) :- trans(X,A,X1).
trans(ccs_par(X,Y), A, ccs_par(X,Y1)) :- trans(Y,A,Y1).

As we will see below, this innocent-looking generalisation adds a lot of
power, resulting in a language CSP(LP) which unifies CSP-FDR, (constraint) 
logic programming and concurrent logic programming. Of course, from a seman- 
tical point of view we have to be very careful when using this extension. For 
example, when computing a trace for a CSP(LP) specification, we have to check 
that there are concrete values for the uninstantiated variables which satisfy all 
the constraints set up by the interpreter. 
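
The slogan “synchronisation is unification” can be made concrete with a small Python sketch. It is not the unify_values predicate of the interpreter, whose definition is not shown here; the class Var and the functions walk, unify and synchronise are hypothetical names introduced for this illustration. Two offers on the same channel synchronise exactly when their values unify, whether they are ground, free variables, or partially instantiated terms (encoded as tuples).

```python
class Var:
    """A logical variable; identity matters, the name is only for display."""
    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return "?" + self.name

def walk(t, subst):
    """Follow variable bindings in subst until an unbound variable or a value."""
    while isinstance(t, Var) and t in subst:
        t = subst[t]
    return t

def unify(s, t, subst):
    """Return an extended substitution, or None if s and t clash.

    No occurs check is performed, as in most Prolog systems.
    """
    s, t = walk(s, subst), walk(t, subst)
    if s is t:
        return subst
    if isinstance(s, Var):
        return {**subst, s: t}
    if isinstance(t, Var):
        return {**subst, t: s}
    if isinstance(s, tuple) and isinstance(t, tuple) and len(s) == len(t):
        for a, b in zip(s, t):
            subst = unify(a, b, subst)
            if subst is None:
                return None
        return subst
    return subst if s == t else None

def synchronise(offer1, offer2):
    """Offers are (channel, value) pairs; they synchronise iff the channels
    agree and the values unify. Returns the substitution or None."""
    (ch1, v1), (ch2, v2) = offer1, offer2
    if ch1 != ch2:
        return None
    return unify(v1, v2, {})
```

The first two cases of the paragraph above (ground with ground, ground with variable) are the ones CSP-FDR supports; the sketch additionally accepts variable with variable and partially instantiated terms on both sides, which is the generalisation discussed here.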

3.2 The Expressive Power of CSP(LP) 

Obviously, CSP(LP) allows one to express basically all things expressible in 
CSP-FDR. In this section we illustrate the additional expressivity of CSP(LP). 

Classical Prolog in CSP(LP) We first show how one can do “classical” logic 
programming within CSP(LP), thus showing it to be a proper extension of both 
logic programming as well as of CSP-FDR. E.g. this is how one can encode the 
append and double-append predicates in CSP(LP). 

App(nil,_Z,_Z) = SKIP;
App(cons(_H,_X),_Y,cons(_H,_Z)) = App(_X,_Y,_Z);
Dapp(_X,_Y,_Z,_R) = App(_X,_Y,_XY) ->> App(_XY,_Z,_R);
MAIN = Dapp(cons(a,nil),cons(b,nil),cons(c,nil),_R) ->> (cas!_R -> STOP);

The computed answer is output on channel cas. Note that sequential compo- 
sition is used to encode conjunction and SKIP is used to encode success. This 
essentially mimics Prolog left-to-right execution. More flexible co-routining can 
be encoded using the interleaving operator:⁷

Dapp(_X,_Y,_Z,_R) = App(_X,_Y,_XY) ||| App(_XY,_Z,_R);

⁷ But then, to prevent looping of the interpreter, one needs to generate a τ-action for
every unfolding (as discussed in Section 3.1). One could also add delay declarations.

Constraints The following encodes the Petri net from [23] (controlling access to 
a critical section [cs] via a semaphore) and also shows how one can use constraints 
within CSP(LP). 

Petri(P,Sem,CS,Y,RC) = enter:(Sem>0 and P>0) -> Petri(P-1,Sem-1,CS+1,Y,RC);
Petri(P,Sem,CS,Y,RC) = exit:(CS>0) -> Petri(P,Sem+1,CS-1,Y+1,RC);
Petri(P,Sem,CS,Y,RC) = restart:(Y>0) -> Petri(P+1,Sem,CS,Y-1,RC+1);

MAIN = Petri(2,1,0,0,0);
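
To see how the three guarded equations drive the process, the following Python sketch transcribes them directly: each guard becomes an if test on the current marking. It is an illustration, not output of the CSP(LP) interpreter; the function names petri_steps and traces are assumptions made here.

```python
def petri_steps(state):
    """Yield (action, next_state) for a marking (p, sem, cs, y, rc)."""
    p, sem, cs, y, rc = state
    if sem > 0 and p > 0:                       # guard of the enter equation
        yield ("enter", (p - 1, sem - 1, cs + 1, y, rc))
    if cs > 0:                                  # guard of the exit equation
        yield ("exit", (p, sem + 1, cs - 1, y + 1, rc))
    if y > 0:                                   # guard of the restart equation
        yield ("restart", (p + 1, sem, cs, y - 1, rc + 1))

def traces(state, depth):
    """All action sequences of exactly `depth` steps starting from `state`."""
    if depth == 0:
        return [[]]
    return [
        [action] + rest
        for action, nxt in petri_steps(state)
        for rest in traces(nxt, depth - 1)
    ]
```

Starting from the marking of MAIN, (2,1,0,0,0), only enter is possible at first, then only exit (the semaphore is taken), after which the net may either let the second process enter or restart the first one.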



Streams and Shared Memory As CSP(LP) can handle logical variables, we 
can basically use all the “tricks” of concurrent (constraint) logic programming 
languages [32] to represent asynchronous communication via streams (at the cost 
of complicating the semantical underpinning, as we have to use delay declarations 
to ensure that values are not read before they have been written). The example 
below illustrates this feature, where the process INT instantiates a stream (by 
generating ever larger numbers) and BUF reads from the stream and outputs 
the result on the channel b. 

delay BUF(X) until nonvar(X);
BUF(nil) = STOP;
BUF(cons(_H,_T)) = b!_H -> BUF(_T);
INT(_X,cons(_X,_T)) = i!_X -> INT(+(_X,1),_T);
MAIN = INT(0,_S) ||| BUF(_S);
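
The role of the delay declaration can be mimicked in Python with generators. This is a rough analogy under stated assumptions rather than a model of the CSP(LP) semantics: the shared stream is a Python list that only grows, the names int_process, buf_process and interleave are hypothetical, and the round-robin scheduler picks just one of the many interleavings the ||| operator allows.

```python
from itertools import islice

def int_process(stream, start=0):
    """Producer: binds the next stream cell to n and emits i!n."""
    n = start
    while True:
        stream.append(n)
        yield ("i", n)
        n += 1

def buf_process(stream):
    """Consumer: emits b!h for each bound cell and suspends while the
    next cell is still unbound (the analogue of the delay declaration)."""
    pos = 0
    while True:
        if pos < len(stream):
            yield ("b", stream[pos])
            pos += 1
        else:
            yield None  # suspended, nothing to read yet

def interleave(procs):
    """Round-robin scheduler; suspended processes contribute no event."""
    while True:
        for proc in procs:
            event = next(proc)
            if event is not None:
                yield event
```

Even when the consumer is scheduled first, it cannot output a value before the producer has written it, which is the property the delay declaration is there to guarantee.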

One can also encode a limited form of (write-once read-many) shared mem- 
ory. In that case variables are similar to pointers which can be passed around 
processes. In the simple example below, three processes share the pointer X to 
a memory location. When RESET sets this location to 0 the result becomes 
immediately visible to the 2 REP processes. 

REP(X) = t.X -> REP(X);
RESET(0) = reset -> STOP;
MAIN = (REP(X) ||| REP(X) ||| RESET(X));

3.3 Adding Domain-Specific Features

Inspired by [8], we have added a domain specific compensation operator ⋈.
Intuitively, P ⋈ S behaves like P until the special abort action is performed.
At that point all the actions performed by P since the last commit action are
compensated for in reverse order, according to the renaming specified by S. For
example, for P = a → b → abort → STOP we have that P ⋈ {a ← c_a} has the
trace ⟨a, b, abort, c_a⟩. This compensation operator was used to arrive at a
succinct specification of an electronic bookshop in Appendix A. For example, we use
the construct Shopper(id) ⋈ {db.rem ← db.add} to ensure that, if the shopper
process Shopper(id) terminates abnormally, books are re-inserted automatically
into the main database db. One can encode more elaborate compensation
mechanisms, which, e.g., distinguish between sequential and parallel compensations
[8].

trans(compensate(P,RenLst,CP), A, compensate(P1,RenLst,prefix(V,CH,CP))) :-
    dif(A,tick), dif(A,io([],abort)), dif(A,io([],commit)), trans(P,A,P1),
    compensate_action(A,RenLst,RA), RA = io(V,CH).
trans(compensate(P,RenLst,CP), A, compensate(P1,RenLst,CP)) :-
    dif(A,tick), dif(A,io([],abort)), dif(A,io([],commit)), trans(P,A,P1),
    \+(compensate_action(A,RenLst,_)).
trans(compensate(P,_,_), tick, omega) :- trans(P,tick,_).
trans(compensate(P,_,CP), io([],abort), CP) :- trans(P,io([],abort),_).
trans(compensate(P,RenLst,_), io([],commit), compensate(P1,RenLst,skip)) :-
    trans(P,io([],commit),P1).
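
The effect of these rules on a single linear run can be sketched in Python. The sketch is a simplification under stated assumptions: the process is given as a plain list of actions rather than a CSP(LP) term, the renaming is a dictionary from an action to its compensating action, and the function name compensate is chosen here for illustration. As in the rules above, each compensable action pushes its compensation in front of the pending ones, commit clears them, and abort replays them.

```python
def compensate(actions, renaming):
    """Return the observable trace of running `actions` under compensation.

    renaming maps an action to its compensating action. On "abort" the
    compensations of all actions performed since the last "commit" are
    appended in reverse order and the run ends.
    """
    trace, pending = [], []
    for action in actions:
        trace.append(action)
        if action == "abort":
            return trace + pending
        if action == "commit":
            pending = []
        elif action in renaming:
            # Most recent action first, so compensation runs in reverse order.
            pending.insert(0, renaming[action])
    return trace
```

For the example from the text, compensate(["a", "b", "abort"], {"a": "c_a"}) gives the trace a, b, abort, c_a.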

In the context of our work with industrial partners, we have also been able to 
write an interpreter for the high-level proforma language, which is being used 
for clinical decision making. All this underlines our claim that logic program- 
ming is a very good basis to implement, and experiment with (domain specific) 
specification languages. Furthermore, once that implementation is complete, one 
can then use many existing, generic tools to obtain features such as animation, 
compilation, model checking, without further implementation effort. We demon- 
strate this in the following sections. 

4 Applying Existing Tools 

4.1 Animation 

An animator was developed using the Tcl/Tk library of SICStus Prolog 3.8.4. 
This turned out to be very straightforward, and the tool depicted in Figure 2 
was basically developed in a couple of days (and can be grafted on top of other 
interpreters). Part of the code looks like this, where translate_value converts 
terms into a form readable by Tcl/Tk: 

tcltk_get_options(Options) :- current_expression(CurState),
    findall(BT, (trans(CurState,B,_NE), translate_value(B,BT)), Acts),
    (history([]) -> (Options=Acts) ; (append(Acts,['BACKTRACK'],Options))).

The tool was inspired by the ARC tool [14] for system level architecture mod- 
elling and supports (backtrackable) step-by-step animation of the CSP speci- 
fication as well as an iterative deepening search for invalid states. We plan to 
provide a graphical notation for CSP(LP) and also link the tool with the partial 
evaluators and model checkers described below. 

4.2 Compiling Using Partial Evaluation 

As our interpreter has been written purely declaratively, we can apply many 
existing specialisation and analysis tools without much hassle.

Fig. 2. Screenshot of the Animator for CSP(LP)



One potential application is the compilation of CSP(LP) code into ordinary 
Prolog code using partial evaluation. For example, the following is a compilation 
of the CSP(LP) process BUF (in the context of a tracing predicate trace) from 
the previous section using the ecce online partial evaluator [22]: 

/* Specialised program generated by Ecce 1.1 */ 

/* Transformation time: 183 ms */ 

trace(agent_call(a_BUF(keep(A))),B) :- trace__1(A,B).
trace__1(A,[]).
trace__1([A|B],[io([A],b)|C]) :- trace__1(B,C).

This compilation is very satisfactory, and has completely removed the over- 
head of CSP(LP). Further work will be required to achieve efficient, predictable 
compilation for all CSP(LP) programs, e.g., by using an offline specialiser such as 
[18]. It also seems that partial evaluation could be used to compute the so-called 
symbolic operational semantics of CSP-FDR (see, e.g., [20]). 

4.3 Model Checking 

One can directly link the interpreter with logic programming based CTL
model checkers, as in [27, 6], [7], [23, 21], [25], to achieve finite state model checking
at no extra implementation effort.

For instance, when combining⁸ our CSP interpreter with the CTL model
checker of [23], we were able to check CTL formulas such as ∀□(¬deadlock),
∃◇deadlock, ∃◇restart, ∀□∃◇restart for the Petri process from Section 3.2 (without the
place RC to make the state space finite):

check(F) :- sat(agent_call(a_MAIN),F).

| ?- check(ag(not(p(deadlock)))).
yes
| ?- check(ef(p(deadlock))).
no
| ?- check(ef(p(io([],restart)))).
yes
| ?- check(ag(ef(p(io([],restart))))).
yes
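
The four answers can be cross-checked by explicit-state exploration. The Python sketch below is not the tabling-based checker of [23]; it enumerates the reachable markings of the Petri process without RC and computes EF as a least fixpoint, with AG obtained by duality. The function names, and the reading of p(deadlock) as "no transition is enabled" and of p(io([],restart)) as "restart is enabled", are assumptions made for this illustration.

```python
def successors(state):
    """Petri process of Section 3.2 without the counter RC: (p, sem, cs, y)."""
    p, sem, cs, y = state
    if sem > 0 and p > 0:
        yield ("enter", (p - 1, sem - 1, cs + 1, y))
    if cs > 0:
        yield ("exit", (p, sem + 1, cs - 1, y + 1))
    if y > 0:
        yield ("restart", (p + 1, sem, cs, y - 1))

def reachable(init):
    """Set of all states reachable from init (finite for this net)."""
    seen, todo = {init}, [init]
    while todo:
        state = todo.pop()
        for _, nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return seen

def ef(states, prop):
    """States from which some path reaches a state satisfying prop."""
    sat = {s for s in states if prop(s)}
    changed = True
    while changed:
        changed = False
        for s in states - sat:
            if any(t in sat for _, t in successors(s)):
                sat.add(s)
                changed = True
    return sat

def ag(states, prop):
    """AG prop, computed as the complement of EF (not prop)."""
    return states - ef(states, lambda s: not prop(s))

def deadlock(state):
    return not any(True for _ in successors(state))

def can_restart(state):
    return any(action == "restart" for action, _ in successors(state))
```

With the initial marking (2,1,0,0) the sketch finds five reachable markings, none of them deadlocked, and restart remains reachable from every one of them, in line with the answers reported above.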



This task again required very little effort, and it would have been much 
more work to generate, e.g., a Promela encoding of CSP(LP) for use with the model
checker Spin. In future work, we aim to combine our interpreter directly with an
infinite state model checker, as described in [23], thus providing more powerful 
verification and hopefully limiting the need for users to manually abstract the 
system to be analysed. 

5 Related Work 

We are not the first to realise the potential of logic programming for animat- 
ing and/or implementing high-level specification languages. See for example [3], 
where an animator for VERILOG is developed in Prolog, or [2] where Petri 
nets are mapped to CLP. Also, the model checking system XMC contains an 
interpreter for value-passing CCS [27, 6]. Note, however, that CCS [24] uses a 
more low-level communication mechanism than CSP and that we provide a much 
richer specification language with sophisticated datatypes, functions, etc. 

The idea of automatically generating compilers from interpreters by partial 
evaluation is not new either. Its potential for domain specific languages has been 
identified more recently (e.g., [5]) but has not yet made a big impact in the logic 
programming area. A similar approach is advocated in [11] to systematically 
generate correct compilers from denotational semantics specifications in logic 
programming. This approach was applied in [19] to verify an Ada implementation
of the “Bay Area Rapid Transit” controller. The approach in [11, 19] has
its root in denotational semantics, while we focus on operational semantics with 
associated techniques such as model checking. Also, [11, 19] tries to verify im- 
plementations while we try to apply our techniques earlier in the software cycle, 
to specifications expressed in a new higher-level language (possibly with domain 
specific features) but whose verification is (arguably) more tractable than for 
the final implementation. 

⁸ For this we had to slightly adapt the interpreter, as the CTL checker currently runs
only in XSB Prolog.

Also, extending logic programming for concurrency is not new. Many concurrent
(constraint) logic programming languages have been developed (see, e.g.,
[32]). Compared to these languages, we have added synchronous message passing 
communication via named channels, as well as many CSP-specific operators and 
features (such as the distinction between internal and external choice). A partic- 
ular, more recent language worth mentioning is Oz [33, 12], which also integrates 
(amongst others) concurrency with logic programming. All of the above are real 
programming languages, whereas we are interested in a specification language 
suitable to rigorous mathematical inspection and verification. In other words, 
we want to be able to reason about distributed and concurrent systems, and not 
necessarily develop efficient distributed programs.

A Simple E-Bookshop in CSP(LP)

-- a generic database
agent DB(multiset) : {db};
DB(nil) = db!empty?_pid -> DB(nil);
DB(_State) = db?member._x?_pid: (_x in _State) -> DB(_State);
DB(_State) = db?add?_x?_pid -> DB(cons(_x,_State));
DB(_State) = db?rem?_x?_pid: _x in _State -> DB(rem(_State,_x));
DB(_State) = db?nexists?_x?_pid: not(_x in _State) -> DB(_State);

-- a basket is a database
agent NewBasket : {basket};
NewBasket = DB(nil) [[ {db <- basket} ]];

-- a shopper process with its own basket
agent NewShopper(integer) : {pay,db};
NewShopper(_id) =
    (( (Shopper(_id) [| {basket} |] NewBasket) \\ {basket} )
       <> {db.rem<-db.add} ) \\ {abort,commit}
    ->> end!_id->STOP;

agent Shopper(integer) : {pay,db,basket,checkout,quit};
Shopper(_id) = db!member!_x!_id ->
    ( (db!rem._x!_id -> basket!add._x!_id -> Shopper(_id)) []
      (db!nexists._x!_id -> Shopper(_id)) );
Shopper(_id) = checkout!_id -> Payer(_id);
Shopper(_id) = quit!_id -> abort -> SKIP;

agent Payer(integer) : {pay,db,basket};
Payer(_id) =
    ((pay!_id?ok -> commit -> SKIP) []
     (pay!_id?ko -> db!add!_x -> abort -> SKIP) );
Payer(_id) = basket?empty -> SKIP;

-- a simple credit card agency
agent CreditCardAgency(multiset) : {pay};
CreditCardAgency(_DB) = pay?_id!ok:(_id in _DB) -> CreditCardAgency(_DB);
CreditCardAgency(_DB) =
    pay?_id!ko:not(_id in _DB) -> CreditCardAgency(_DB);

-- a test system with 2 shoppers and 2 books
agent MAIN : {db,pay};
MAIN = (DB(cons(the_bible,cons(das_kapital,nil)))
        [| {db} |]
        ( NewShopper(1) ||| NewShopper(2) ))
       [| {pay} |] CreditCardAgency(cons(1,nil));

Abstract. Functional Reactive Programming (FRP) is a declarative
programming model for constructing interactive applications based on 
a continuous model of time. FRP programs are described in terms of 
behaviors (continuous, time-varying, reactive values), and events (conditions
that occur at discrete points in time).

This paper presents Frappe, an implementation of FRP in the Java
programming language. The primary contribution of Frappe is its integration
of the FRP event/behavior model with the Java Beans event/property
model. At the interface level, any Java Beans component may be used
as a source or sink for the FRP event and behavior combinators. This 
provides a mechanism for extending Frappe with new kinds of I/O con- 
nections and allows FRP to be used as a high-level declarative model for 
composing applications from Java Beans components. At the implemen- 
tation level, the Java Beans event model is used internally by Frappe 
to propagate FRP events and changes to FRP behaviors. This allows 
Frappe applications to be packaged as Java Beans components for use 
in other applications, and yields an implementation of FRP well-suited 
to the requirements of event-driven applications (such as graphical user 
interfaces). 



1 Introduction 

Recent work in the functional programming community has proposed Functional 
Reactive Programming (FRP) as a declarative programming model for 
constructing interactive applications. FRP programs are described in terms of 
behaviors (continuous, time-varying, reactive values), and events (conditions that 
occur at discrete points in time). 

All previous implementations of FRP have been embedded in the Haskell 
programming language [15]. As discussed in [10], Haskell’s lazy evaluation, rich 
type system, and higher-order functions make it an excellent basis for development 
of new domain-specific languages and new programming paradigms such 
as FRP. One recent Haskell implementation of FRP that has received particular 
scrutiny is “SOE” FRP, described in [11].^1 In this paper, all Haskell examples 
are given for SOE FRP. 

* This material is based upon work supported under a National Science Foundation 
Graduate Research Fellowship. Any opinions, findings, conclusions or recommendations 
expressed in this publication are those of the author and do not necessarily 
reflect the views of the National Science Foundation. 

In the Java community, recent work has produced the Java Beans component 
model [2]. The Java Beans component model prescribes a set of programming 
conventions for writing re-usable software components. A programmer writes a 
Java Beans component by defining a Java class that specifies a set of events (“in- 
teresting” conditions which result in notifying other objects of their occurrence) 
and properties (named mutable attributes of the component that may be read 
or written with appropriate methods). 

The FRP and Java Beans programming models have very different goals 
and appear, at first glance, to be completely unrelated. The goal of FRP is to 
enable the programmer to write concise descriptions of interactive applications 
in a declarative modeling style, whereas the goal of Java Beans is to provide 
a component framework for visual builder tools. However, the two models also 
have some alluring similarities: both have a notion of events, and both have a 
notion of values that change over time (behaviors in FRP, properties in Java 
Beans). Our primary motivation for developing Frappe was to explore the 
relationship between the two models. 

This paper presents Frappe, an implementation of FRP in Java. Our im- 
plementation is based on a correspondence between the FRP and Java Beans 
programming models, and our implementation integrates the two models very 
closely. There are two aspects to this integration: First, any Java Beans com- 
ponent may be used as a source or sink for the FRP event and behavior com- 
binators. Second, the Java Beans event model is used internally by Frappe for 
propagation of FRP events and changes to FRP behaviors. Allowing any Java 
Beans component to be used as a source or sink for the FRP event and behavior 
combinators allows the Java programmer to use FRP as a high-level declara- 
tive model for composing interactive applications from Java Beans components, 
and allows Frappe to be extended with new kinds of I/O connections without 
modifying the Frappe implementation. Using the Java Beans event model 
internally allows Java Beans components connected by FRP combinators to be 
packaged as larger Java Beans components for use by other Beans-aware Java 
tools, and yields a “push” model for propagation of behavior and event values 
that is well-suited to the requirements of graphical user interfaces. 

The remainder of this paper is organized as follows. Section 2 gives a brief 
review of the FRP and Java Beans models. Section 3 describes the Frappe library 
interface and how the library is used to construct applications from Java Beans 
components and Frappe classes. Section 4 describes the implementation of FRP 
in Frappe. Section 5 summarizes the status of the implementation. Section 6 
discusses some limitations of our implementation. Section 7 describes related 
work. Section 8 summarizes our contributions, and briefly discusses some open 
questions and plans for future work. 

^1 So-named because the implementation is described in the textbook “The Haskell 
School of Expression”. 

2 Preliminaries 

2.1 Functional Reactive Programming 

Functional Reactive Programming (FRP) is a high-level declarative program- 
ming model for constructing interactive applications. In this section, we give a 
very brief introduction to the aspects of FRP needed to understand the rest of 
the paper; see [6, 4, 11] for more details. 

There are two key polymorphic data types in FRP: Behavior and Event. 
Conceptually, a Behavior is a time-varying continuous value. One can think of 
type Behavior a as having the Haskell definition: 

type Behavior a = Time -> a 

That is, a value of type Behavior a is a function from Time to a. Given this 
definition, we can think of sampling a behavior as simply applying the behavior 
to some sample time. The simplest examples of behaviors are constant behav- 
iors: those that ignore their time argument and evaluate to some constant value. 
For example, constB red has type Behavior Color. It evaluates to red regard- 
less of the sample time. An example of a time-varying behavior (taken from a 
binding of FRP for computer animation [6]) is mouse (of type Behavior Point). 
When sampled, mouse yields a representation of the mouse position at the given 
sample time. Sampling the mouse at different times may yield a different Point 
depending on whether the user has moved the mouse. 
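
The function-of-time reading of behaviors can be sketched in Java as well. The following is only an illustrative model, not Frappe's encoding (which is described in Section 4): the names BehaviorModel, at and xPosition are hypothetical, and Time is assumed to be a double measured in seconds.

```java
/** Sketch: a behavior modelled as a function from sample time to a value. */
final class BehaviorModel {
    /** Mirrors `type Behavior a = Time -> a`, with Time assumed to be seconds as a double. */
    interface Behavior<A> {
        A at(double time);
    }

    /** Analogue of constB: ignores the sample time and always yields the same value. */
    static <A> Behavior<A> constB(A value) {
        return time -> value;
    }

    /** A hypothetical time-varying behavior: the x-coordinate of a point moving at unit speed. */
    static final Behavior<Double> xPosition = time -> time;

    public static void main(String[] args) {
        Behavior<String> red = constB("red");
        // Sampling is just function application at a chosen time.
        System.out.println(red.at(0.0) + " " + red.at(42.0));
        System.out.println(xPosition.at(1.5));
    }
}
```

Sampling the constant behavior at two different times yields the same value, whereas sampling xPosition yields a value that depends on the sample time.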

Conceptually, an Event is some condition that occurs at a discrete point 
in time. In Haskell, we write the type Event a for an event source capable of 
producing a sequence of occurrences, where each occurrence carries a value of 
type a. For example: 

lbp :: Event () 
key :: Event Char 

declare the types of two primitive event sources defined in the FRP binding for 
computer animation. The first event source, lbp, generates an event occurrence 
every time the left mouse button is pressed. Each occurrence carries a value of 
type () (read “unit”), meaning that there is no data carried with this event 
other than the fact that it occurred. The second event source, key, generates an 
event occurrence every time a key is pressed on the keyboard. Each occurrence 
carries a value of type Char representing the key that was pressed. 

An implementation of FRP provides the programmer with a set of primi- 
tive behaviors and event sources, and a library of combinators for creating new 
behaviors and event sources from existing ones. For example, the expression: 



lbp -=> red 

uses the -=> combinator (of type Event a -> b -> Event b) to produce an event 
source of type Event Color. The event occurs whenever lbp occurs (i.e. every 
time the left mouse button is pressed), but each occurrence carries the value red. 
More complex event sources and behaviors are produced by nested applications 
of combinators. For example, the expression: 

(lbp -=> red .|. rbp -=> blue) 

uses the merge operator (.|.) to produce an event source (of type Event 
Color) that occurs whenever the left or right mouse button is pressed. When the 
left button is pressed, an occurrence is generated carrying the value red; when 
the right button is pressed, an occurrence is generated carrying the value blue. 
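
To make the two event combinators concrete, here is a minimal push-based sketch in Java. It is an assumption-laden model rather than the Frappe API: EventSource, subscribe, fire, withValue (standing in for -=>) and merge (standing in for .|.) are hypothetical names, and the unit type () is approximated by Void.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class EventModel {
    /** An event source pushes the value of each occurrence to its subscribers. */
    static final class EventSource<A> {
        private final List<Consumer<A>> listeners = new ArrayList<>();

        void subscribe(Consumer<A> listener) { listeners.add(listener); }

        /** Fire one occurrence carrying the given value. */
        void fire(A value) {
            for (Consumer<A> l : new ArrayList<>(listeners)) l.accept(value);
        }

        /** Analogue of `e -=> v`: same occurrences, each carrying the constant v. */
        <B> EventSource<B> withValue(B constant) {
            EventSource<B> out = new EventSource<>();
            subscribe(ignored -> out.fire(constant));
            return out;
        }

        /** Analogue of `.|.`: occurs whenever either input occurs. */
        EventSource<A> merge(EventSource<A> other) {
            EventSource<A> out = new EventSource<>();
            subscribe(out::fire);
            other.subscribe(out::fire);
            return out;
        }
    }

    /** Simulates left, right, left button presses and records what the merged source emits. */
    static List<String> demo() {
        EventSource<Void> lbp = new EventSource<>();
        EventSource<Void> rbp = new EventSource<>();
        EventSource<String> colors = lbp.withValue("red").merge(rbp.withValue("blue"));
        List<String> seen = new ArrayList<>();
        colors.subscribe(seen::add);
        lbp.fire(null);
        rbp.fire(null);
        lbp.fire(null);
        return seen;
    }
}
```

In this model the merged source emits one color per button press, in the order in which the presses were simulated.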

The FRP model defines a combinator, switcher, for converting an event 
source to a behavior. The type of switcher is given as: 

switcher :: Behavior a -> Event (Behavior a) -> Behavior a 

Informally, switcher produces a behavior that initially follows its first argu- 
ment. However, every time an event occurs on the event source given as the 
second argument, switcher “switches” to follow the behavior carried in that 
event occurrence. For example: 

c = switcher red (lbp -=> red .|. rbp -=> blue) 

uses switcher to define c as a behavior with type Behavior Color.^2 When the 
application starts, c will initially be red. When the left mouse button is pressed, 
c changes to red, and when the right mouse button is pressed, c changes to blue. 
This is an example of a reactive behavior: it changes in response to a user input 
event. 
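
The informal description of switcher can be modelled by sampling against a fixed list of timed occurrences. The sketch below is hypothetical (SwitcherModel, Occurrence and the press times are invented for illustration) and assumes that an occurrence takes effect at its own timestamp and that occurrences are listed in time order.

```java
import java.util.ArrayList;
import java.util.List;

final class SwitcherModel {
    interface Behavior<A> { A at(double time); }

    /** One event occurrence: the time it happened and the behavior it carries. */
    static final class Occurrence<A> {
        final double time;
        final Behavior<A> next;
        Occurrence(double time, Behavior<A> next) { this.time = time; this.next = next; }
    }

    /**
     * Follows `initial` until the first occurrence, then follows the behavior
     * carried by the most recent occurrence at or before the sample time.
     * Occurrences are assumed to be sorted by time.
     */
    static <A> Behavior<A> switcher(Behavior<A> initial, List<Occurrence<A>> occurrences) {
        return time -> {
            Behavior<A> current = initial;
            for (Occurrence<A> o : occurrences) {
                if (o.time <= time) current = o.next;
            }
            return current.at(time);
        };
    }

    /** Models c with a right press at t = 2 and a left press at t = 5 (invented times). */
    static Behavior<String> example() {
        List<Occurrence<String>> presses = new ArrayList<>();
        presses.add(new Occurrence<String>(2.0, t -> "blue")); // rbp -=> blue
        presses.add(new Occurrence<String>(5.0, t -> "red"));  // lbp -=> red
        return switcher(t -> "red", presses);
    }
}
```

Sampling the example before t = 2 gives red, between the two presses gives blue, and after t = 5 gives red again.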

A binding of FRP for a particular problem domain will usually define a type 
for a top-level behavior that represents the output of the application. A complete 
application is written by using the FRP combinators to define an expression 
for a behavior of this type, and passing this value to a display function. For ex- 
ample, in computer animation, the output of an application is of type Behavior 
Picture. An example, then, of a complete FRP application for computer anima- 
tion is: 

exampleApp = withColor c circle 
  where c = switcher red (lbp -=> red .|. rbp -=> blue) 

main = animate exampleApp 

This application renders a circle of unit size in an output window. The circle will 
be initially red, but the color will change between red and blue as the left and 
right mouse buttons are pressed. 

^2 Here we use implicit lifting of constants to behaviors. Strictly speaking, we should 
have written constB red instead of just red, but the Haskell implementations of 
FRP use instance declarations to perform this translation automatically. 

2.2 Java Beans 

This section gives a brief summary of the Java Beans programming model. For a 
more complete account, the reader is referred to the Java Beans Specification [2]. 



What Is a Bean? The Java Beans Specification defines a Java Bean informally 
as “a reusable software component that can be manipulated visually in a builder 
tool”. For example, all of the Swing user interface components (buttons, sliders, 
windows, etc.) are Beans. However, Beans need not have any visual appearance 
at run-time. For example, a software component that takes a ticker symbol as 
input and periodically delivers stock quotes for the ticker symbol could easily be 
packaged as a Bean. 

More concretely, then, a Bean is a Java class that conforms to certain pro- 
gramming conventions prescribed by the Java Beans specification. These con- 
ventions specify that a Bean should make its functionality available to clients 
through: 

- Properties: Mutable named attributes of the object that can be read or written 
  by calling appropriate accessor methods. 

- Events: A named set of “interesting” things that may happen to the Bean 
  instance. Clients may register to be notified when an event occurs by implementing 
  a listener interface. When the event occurs, the component notifies 
  the client by invoking a method defined in the listener interface. 

- Methods: Ordinary Java methods, invoked for their side-effects on the Bean 
  instance or its environment. 

The Beans model does not require a separate interface definition language for 
specifying the interface to a Java Beans component. Instead, the Beans model 
prescribes that a Java Bean can be written as an ordinary Java class in the Java 
language, and can be packaged for use by a builder tool simply by compiling 
the source file to the standard Java class file format. The “builder tools” use 
reflection [13] on the class file to recover information about the features exported 
by the particular Bean, and the standard Java library provides a set of helper 
classes in the java.beans package for use by builder tools. These helper classes 
perform the low-level reflection operations to recover information about events, 
properties and methods supported by a Bean. 

A full discussion of the programming conventions used to define Java Beans is 
outside the scope of this paper. However, to give a basic feel for the programming 
conventions, we will show how a Java class is defined to export a set of properties, 
thus making the class a Java Bean. 

Properties are accessed by means of accessor methods, which are used to 
read and write values of the property. A property may be readable or writable 
(or both), which determines whether the property supports get or set accessor 
methods. The convention for get accessor methods is: 

public PropertyType getPropertyName(); 

and the convention for set accessor methods is: 

public void setPropertyName(PropertyType arg); 

where PropertyName is the (appropriately capitalized) name of the property, 
and PropertyType is the Java type of the property. 

^3 In the remainder of this paper, we use the terms “Java Beans component” and “Bean” 
interchangeably. 

For example, the class JComponent of Java Swing defines get and set accessors 
for its width and height properties as: 

public class JComponent { 

    public int getWidth(); 
    public void setWidth(int arg); 
    public int getHeight(); 
    public void setHeight(int arg); 

} 
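
As an illustration of how a builder tool can recover properties from these naming conventions, the following sketch uses the standard java.beans.Introspector on a small bean. The Note class and the describe helper are hypothetical and exist only for this example.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.Map;
import java.util.TreeMap;

final class AccessorConventionDemo {
    /** Hypothetical bean: `title` is readable and writable, `length` is read-only. */
    public static class Note {
        private String title = "";
        public String getTitle() { return title; }
        public void setTitle(String arg) { this.title = arg; }
        public int getLength() { return title.length(); }
    }

    /** Maps each discovered property name to "r", "w" or "rw". */
    static Map<String, String> describe(Class<?> beanClass) {
        Map<String, String> result = new TreeMap<>();
        try {
            // Stop at Object so that the inherited getClass() is not reported as a property.
            BeanInfo info = Introspector.getBeanInfo(beanClass, Object.class);
            for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
                String mode = (pd.getReadMethod() != null ? "r" : "")
                            + (pd.getWriteMethod() != null ? "w" : "");
                result.put(pd.getName(), mode);
            }
        } catch (IntrospectionException e) {
            throw new IllegalStateException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(describe(Note.class));
    }
}
```

No interface definition language is involved: the property names and their read/write modes are derived purely from the method names and signatures.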



Bound Properties A particularly important aspect of the Java Beans specification 
is its provision for bound properties. A component author may specify 
that a property is a bound property, meaning that interested clients can register 
to be notified whenever the property’s value changes. A property’s value might 
change either as a direct result of the application program invoking a set accessor 
method, or as an indirect result of some user action. For example, a text entry 
widget might have a text property representing the text entered into the field by 
the user. If implemented as a bound property, an application could register to 
be notified whenever the user changed the contents of the text entry component. 
Bound properties play a critical role in the implementation of FRP behaviors in 
Frappe. 
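
A bound property is commonly implemented with the standard java.beans.PropertyChangeSupport helper. The sketch below shows one way the text-entry example might look; the TextEntry class is hypothetical and is not a Swing or Frappe class.

```java
import java.beans.PropertyChangeListener;
import java.beans.PropertyChangeSupport;
import java.util.ArrayList;
import java.util.List;

final class BoundPropertyDemo {
    /** Hypothetical text-entry bean whose `text` property is bound. */
    public static class TextEntry {
        private final PropertyChangeSupport changes = new PropertyChangeSupport(this);
        private String text = "";

        public String getText() { return text; }

        public void setText(String arg) {
            String old = text;
            text = arg;
            // Listeners are notified only if the old and new values differ.
            changes.firePropertyChange("text", old, arg);
        }

        public void addPropertyChangeListener(PropertyChangeListener l) {
            changes.addPropertyChangeListener(l);
        }

        public void removePropertyChangeListener(PropertyChangeListener l) {
            changes.removePropertyChangeListener(l);
        }
    }

    /** Records every new value reported for the `text` property. */
    static List<Object> demo() {
        TextEntry entry = new TextEntry();
        List<Object> seen = new ArrayList<>();
        entry.addPropertyChangeListener(evt -> seen.add(evt.getNewValue()));
        entry.setText("h");
        entry.setText("hi");
        entry.setText("hi"); // unchanged value, so no notification is sent
        return seen;
    }
}
```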

3 Frappe - A User’s Perspective 

Frappe is implemented as a Java library organized around two Java interfaces: 
Behavior and FRPEventSource. These interfaces correspond to the parameterized 
types Behavior and Event in Haskell implementations of FRP. 

The combinators in core FRP are implemented as concrete classes in Frappe. 
Each such class provides a constructor that takes the same number and type 
of arguments as the combinator’s Haskell counterpart, and the class implements 
either the Behavior or FRPEventSource interface in accordance with the result type 
of the particular combinator. For example, the switcher combinator introduced 
in section 2 is realized as the following Java class: 

public class Switcher implements Behavior { 
    public Switcher(Scheduler sched, 
                    Behavior bv, 
                    FRPEventSource evSource) 

} 

The first argument to the constructor is a global scheduling context used by the 
implementation. The programmer obtains such a context once during initialization, 
and simply passes it to all constructors of Frappe classes. The next two 
arguments correspond to the arguments of the switcher combinator: the behavior 
to follow initially, and an event source whose occurrences carry a behavior to 
follow after the occurrence. 

3.1 A Simple Example 

To write programs in Frappe, the programmer simply instantiates a number 
of Java Beans components and connects those components together using the 
Frappe classes corresponding to FRP combinators. The program then relin- 
quishes control to the Java runtime library’s main event loop. 

As a concrete example, consider writing a program to display a red circle 
on the screen that tracks the current mouse position. In SOE FRP, this would 
be written: 

ball = stretch 0.3 (withColor red circle) 

anim = (lift2 move) (p2v mouseB) (constB ball) 

main = animate anim 

The first line here defines ball as a static picture of a red circle located at the 
origin scaled by a factor of 0.3. The second line applies the lifting combinator 
lift2 to the function move. The lifting combinators convert functions over static 
values to functions over behaviors, so if we have a function 
f :: a -> b -> c 

then the lifted version of this function is: 

(lift2 f) :: Behavior a -> Behavior b -> Behavior c 

Here is an equivalent code fragment that implements this same example in 
Frappe^4: 

Drawable circle = new ShapeDrawable(new Ellipse2D.Double(-1, -1, 2, 2)); 
Drawable ball = circle.withColor(Color.red).stretch(0.3); 
Behavior mouse = FRPUtilities.makeBehavior(sched, frame, "mouse"); 
Behavior anim = FRPUtilities.liftMethod(sched, new ConstB(ball), 
                                        "move", new Behavior[] { mouse }); 

franimator.setImageB(anim); 



^4 There is also a small amount of standard “boilerplate” code required to wrap this 
code in a Java class, instantiate the top-level window in which the animation is 
displayed, and initialize Frappe. We have elided this extraneous code due to space 
limitations, but a complete version of this example (and many others) is available in 
the Frappe distribution from the Frappe web site. 

First, we define circle as a ShapeDrawable constructed from a Java2D Ellipse. 
Drawable is an abstract class provided by Frappe to simplify animation program- 
ming. The concrete class ShapeDrawable is a Drawable capable of rendering any 
Shape from the Java2D API [9]. The methods provided by Drawable are based on 
the low-level graphics model of Fran [6] and provide support for scaling, trans- 
lating or changing the color of a Drawable as well as overlaying two Drawables 
to form a composite Drawable. We use two of these methods of circle to define 
ball as a scaled, red circle. 

Next we define mouse using FRPUtilities.makeBehavior. This static method 
creates a Behavior from a bound property of a Java Bean instance. In this case, 
the variable frame is an instance of the class FranFrame. A FranFrame is a top-level 
window (specifically, a specialization of Swing’s JFrame class) for displaying an 
animation. FranFrame is a Java Bean that provides a bound property “mouse”. 
At any point in time, the value of this property is the current mouse position 
within the window. This definition binds the variable mouse to a behavior that, 
when sampled, will return the value of frame.getMouse() at the time of sampling. 
The property name is passed as a string to makeBehavior() because the 
implementation uses Java reflection [13] on the class instance to look up the 
appropriate accessor method and to register for notification when the property 
changes. 

Next we use FRPUtilities.liftMethod to construct the animation. This static 
method is a lifting combinator similar to the liftN combinators provided in 
Haskell implementations of FRP. The second argument to liftMethod() is a 
behavior whose sample values implement the given method. In this example, we 
pass new ConstB(ball) as the second argument. This is a new constant behavior 
whose value at every sample point is ball. Since ball is an instance of Drawable, 
it supports Drawable’s move() method, defined in Drawable as: 

public abstract class Drawable { 

/** return a new Drawable that is a translated version of this */ 
public Drawable move(Point2D pos) ; 



} 

By using liftMethod, this method is lifted to operate on behaviors rather than 
static values: the method is applied pointwise to the values of the target instance 
behavior and argument behaviors at every sample time. In this case, the target 
instance is a constant behavior, and the argument is the behavior variable mouse 
that yields the current mouse position as a Point2D at every sample time. The 
result will be a Behavior whose value at every sample time is a Drawable that 
renders a scaled red circle moved to the current mouse position. 
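
The pointwise reading of lifting can be sketched with ordinary Java functions. This is a simplified model and not the reflective liftMethod of Frappe: LiftModel, lift2 and the string-based move stand-in are hypothetical, and a String replaces Drawable so that the example needs no graphics classes.

```java
import java.util.function.BiFunction;

final class LiftModel {
    interface Behavior<A> { A at(double time); }

    /** lift2 f: apply f pointwise to the values of two behaviors at each sample time. */
    static <A, B, C> Behavior<C> lift2(BiFunction<A, B, C> f, Behavior<A> a, Behavior<B> b) {
        return time -> f.apply(a.at(time), b.at(time));
    }

    /** Stand-in for move: "translates" a labelled shape by an x offset. */
    static String move(String shape, double dx) {
        return shape + "@" + dx;
    }

    static Behavior<String> example() {
        Behavior<String> ball = time -> "ball";        // constant target behavior
        Behavior<Double> mouseX = time -> 10.0 * time; // hypothetical mouse position
        return lift2(LiftModel::move, ball, mouseX);
    }
}
```

At each sample time the lifted function reads the current value of both inputs and applies the static function to them, which is the same shape as the target-instance and argument behaviors described above.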

Finally, we pass this behavior (anim) to the franimator component’s 
setImageB() method to actually display the animation on the screen. The variable 
franimator is an instance of the Franimator class obtained from frame, and 
is a specialization of the Swing JPanel component for displaying animations. 

It is interesting to compare the Haskell version of this example with the Frappe 
version, and to consider what a corresponding version of this example would 
look like in pure Java / Swing without Frappe. The Frappe version introduces 
considerable notational overhead relative to the original Haskell version, but 
much of this stems from Java’s explicit type declarations, verbose method / 
class names, and lack of name overloading (à la Haskell type classes). 

Space considerations prevent us from including complete source for a pure 
Java / Swing version of this example, but we can outline the basic approach. A 
pure Java / Swing version would have to implement a MouseMotionListener to 
handle mouseMoved events, and register this listener instance with the JPanel in 
which the circle is displayed. The programmer would then have to implement 
the handler to extract the (x, y) position from the MouseEvent, and use 
this value to set the (x, y) position of the circle on every event occurrence. 

We have found Frappe to be more concise than Java alone for many of our 
example programs. In this example, the primary savings come from using combinator 
classes instead of trivial listener instances to do simple forms of event 
propagation. Perhaps more important than the savings in code size, we feel 
the Frappe version is conceptually cleaner than a corresponding pure Java/Swing 
version. Instead of writing a plethora of event handlers that perform imperative 
actions to change the program state, the Frappe programmer simply writes a set 
of time-invariant, side-effect-free definitions describing how the different components 
of the program are connected together. 

3.2 Using Java Beans Events 

Just as Java Beans properties may be converted to behaviors by a call to 
FRPUtilities.makeBehavior(), Java Beans events may be converted to FRP 
event sources by a call to FRPUtilities.makeFRPEvent(). For example, the following 
sets the variable lbpEventSource to an FRPEventSource that has an event 
occurrence every time the “lbp” event occurs on frame: 

FRPEventSource lbpEventSource = 
    FRPUtilities.makeFRPEvent(sched, frame, "franMouse", "lbp"); 

The second argument is the Bean instance whose event occurrences are being 
observed. The third argument is the name of the “event set” of interest. This 
determines the listener type that is used to register for event notifications. The 
fourth argument is the name of the specific notification method of interest within 
the listener type. In this example, franMouse is an event set that identifies the 
FranMouseListener event listener class, and lbp is the name of the particular 
method invoked on a FranMouseListener instance when the left mouse button is 
pressed. 

3.3 Using Java Beans Properties as Output Sinks 

We have seen that Java Beans properties can be used as inputs to Frappe’s 
combinator classes. On the output side, we can also connect any behavior to a 
writable property of some Bean, using a BehaviorConnector. For example, the 
following creates a BehaviorConnector that will connect strB (a String-valued 
behavior) to a JLabel’s “text” property: 

JLabel label = ... 
Behavior strB = ... 
new BehaviorConnector(sched, strB, label, "text"); 

This BehaviorConnector will invoke label.setText() every time the value of strB 
changes. 

3.4 Encoding Recursive Definitions 

In the Haskell implementations of FRP, many programs rely on Haskell’s lazy 
evaluation to write mutually recursive behaviors and events. For example: 

sharp2 = when (time >* 1) 
sharp3 = when spike 

spike = (constB False) `switcher` ((sharp2 -=> (constB True)) .|. 
                                   (sharp3 -=> (constB False))) 

The above defines spike as a Behavior Bool that is momentarily True at some 
point in time t = 1 + ε (for some ε) and False everywhere else. 

To encode this example in Frappe, we must perform the cyclic wiring explicitly. 
To support this, Frappe defines an extra constructor for the Switcher class 
that takes only a scheduling context. A call to this constructor returns an uninitialized 
Switcher instance. Recall that the switcher combinator takes two arguments 
(a target behavior and an event source) that are usually passed in explicitly 
to the Switcher constructor. If this alternative constructor is used, then the 
programmer must make explicit calls to the bindTarget() and bindEventSource() 
methods before running the resulting behavior. The above example is coded in 
Frappe as: 

Switcher spike = new Switcher(sched); // uninitialized instance! 

Behavior gtOneB = // Frappe encoding of gtOneB = time >* (constB 1) 
    FRPUtilities.lift(sched, this.getClass(), 
                      "gtOne", new Behavior[] { time }); 

FRPEventSource sharp2 = new When(sched, gtOneB); 
FRPEventSource sharp3 = new When(sched, spike); 

spike.bindTarget(new ConstB(Boolean.FALSE)); 
spike.bindEventSource(new EventMerge(sched, ...)); 

Note that the reference to the uninitialized instance spike is used in the definition 
of sharp3, but the appropriate calls are made to spike.bindTarget() and 
spike.bindEventSource() during initialization. 
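
The two-phase pattern (create an uninitialized node, refer to it, then bind it) can be shown in isolation. The sketch below is a hypothetical model, not Frappe's Switcher: Deferred and its bindTarget method only mimic the idea, and the cyclic definition used here is a simple counter rather than the spike example.

```java
final class LateBindingModel {
    interface Behavior<A> { A at(double time); }

    /** A behavior node created first and wired to its target later, so it can take part in a cycle. */
    static final class Deferred<A> implements Behavior<A> {
        private Behavior<A> target; // null until bindTarget is called

        void bindTarget(Behavior<A> target) { this.target = target; }

        @Override public A at(double time) {
            if (target == null) throw new IllegalStateException("bindTarget was never called");
            return target.at(time);
        }
    }

    static Behavior<Integer> example() {
        Deferred<Integer> counter = new Deferred<>(); // uninitialized instance
        // `next` refers to `counter` before counter has been given a target.
        Behavior<Integer> next = time -> time <= 0 ? 0 : counter.at(time - 1) + 1;
        counter.bindTarget(next); // close the cycle explicitly
        return counter;
    }
}
```

Where Haskell's laziness lets the definition refer to itself directly, the Java version has to name the node first and complete the wiring in a separate step; sampling before the binding step fails with an exception.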




Frappe: Functional Reactive Programming in Java 






4 Implementing FRP in Java 

4.1 Behaviors 

Like the Haskell implementations of FRP, Frappe represents the FRP program 
as a graph structure at runtime. We achieve this by defining a Java class for each 
FRP combinator. Each node in the combinator graph is represented by an object 
instance at runtime, and each edge is represented by a field with a reference to 
another instance of a combinator class. 

What operations must each behavior node in the runtime graph support? A 
detailed study of the SOE FRP implementation revealed that, in the absence 
of generalized time transformation, each node in the graph essentially needs to 
support only one operation: get the value of the behavior at the current sample 
time. 

Interestingly, we can model this operation by defining a behavior as a Java 
Bean with a single bound property. This leads to the Java encoding illustrated 
in Figure 1. While the syntax is somewhat verbose, this can be read simply as 
“Every behavior is a Bean that provides a bound property named value.” 

public interface Behavior { 
    /** Accessor to read the current value of this Behavior */ 
    public Object getValue(); 

    /** Register a PropertyChangeListener */ 
    public void addPropertyChangeListener(PropertyChangeListener l); 

    /** Remove a PropertyChangeListener from the list of listeners. */ 
    public void removePropertyChangeListener(PropertyChangeListener l); 
} 



Fig. 1. Java encoding of Behaviors 



Individual behavior objects might be connected as inputs to other nodes in 
the combinator graph, and those nodes will need to be informed when a behavior’s 
value has changed. Hence, we make value a bound property, so that other 
nodes can register for a PropertyChangeEvent when the value of the behavior 
changes. Our implementation uses such events to propagate behavior values 
through the system. 

An implementation of the Behavior interface supports registration of listeners, 
as required of bound properties. All output connections for a node are stored 
in this listener list. If an FRP combinator class uses a Behavior as one of its inputs, 
the combinator class must implement the PropertyChangeListener interface 
in order to be notified of changes in its input behavior. 

Since the Haskell definition of behavior is a polymorphic type, we declare 
the return type of getValue() as Object. The value returned must be converted 
to an instance of the appropriate type using a cast, and the cast is checked at 
runtime. 

4.2 Events 

We implement FRP Events by mapping the FRP notion of “event” directly to 
a Bean Event named FRPEvent. 

The class FRPEvent represents a single event occurrence. It extends 
java.util.EventObject, as required by the Java Beans specification. The 
FRPEventSource interface is implemented by every class that generates FRP 
Events. This interface corresponds directly with the Haskell type Event a that 
identifies an event source in the SOE implementation. The methods defined in 
FRPEventSource are those prescribed by the Java Beans conventions for registering 
event listeners. This interface declaration can be read as stating that “Every 
FRP Event Source is a Bean event source for the event named FRPEvent.” 

The FRPEventListener interface is implemented by any class that wishes to 
be notified when an FRPEvent occurs on some source. A listener is registered 
with the event source by passing a reference to the listener to the source’s 
addFRPEventListener() method. Then, at some point later when the event occurs, 
the event source will notify all registered listeners by invoking each listener’s 
eventOccurred() method, passing it an FRPEvent instance representing the event 
occurrence. 
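
The three types described in this section might be declared roughly as follows. This is a reconstruction from the prose under stated assumptions, not the declarations shipped with Frappe: the payload accessor getValue, the removeFRPEventListener method and the ManualSource class are guesses that follow the usual Beans conventions.

```java
import java.util.ArrayList;
import java.util.EventListener;
import java.util.EventObject;
import java.util.List;

final class FrpEventSketch {
    /** A single event occurrence; the payload accessor is an assumption. */
    static class FRPEvent extends EventObject {
        private final Object value;
        FRPEvent(Object source, Object value) { super(source); this.value = value; }
        Object getValue() { return value; }
    }

    /** Listener type for the event set, following the Beans naming convention. */
    interface FRPEventListener extends EventListener {
        void eventOccurred(FRPEvent event);
    }

    /** Registration methods that the Beans conventions prescribe for an event set. */
    interface FRPEventSource {
        void addFRPEventListener(FRPEventListener l);
        void removeFRPEventListener(FRPEventListener l);
    }

    /** Hypothetical primitive source that is triggered by hand. */
    static final class ManualSource implements FRPEventSource {
        private final List<FRPEventListener> listeners = new ArrayList<>();
        @Override public void addFRPEventListener(FRPEventListener l) { listeners.add(l); }
        @Override public void removeFRPEventListener(FRPEventListener l) { listeners.remove(l); }

        void fire(Object value) {
            FRPEvent event = new FRPEvent(this, value);
            for (FRPEventListener l : new ArrayList<>(listeners)) l.eventOccurred(event);
        }
    }

    /** Registers a listener, fires once, unregisters, and fires again. */
    static List<Object> demo() {
        ManualSource keys = new ManualSource();
        List<Object> seen = new ArrayList<>();
        FRPEventListener recorder = e -> seen.add(e.getValue());
        keys.addFRPEventListener(recorder);
        keys.fire('a');
        keys.removeFRPEventListener(recorder);
        keys.fire('b'); // no longer observed
        return seen;
    }
}
```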

4.3 Propagation of Event and Behavior Values 

Propagation of event and behavior values in Frappe is purely event-driven. To 
execute an FRP specification, a user program simply constructs an explicit graph 
of FRP combinators (initialization), and relinquishes control to the main event 
loop in the Java runtime library. When there is input to the application (for 
example, when the user presses a mouse button), the Java runtime will invoke 
an event handler of some object in the Frappe implementation that implements 
a primitive FRP event source or behavior. This primitive event handler, in turn, 
will invoke the appropriate event handler of each registered listener: 

- For an event source, each event listener implements the FRPEventListener 
  interface. The listener’s eventOccurred() method is invoked to propagate the 
  event occurrence. 

- For a behavior, each event listener implements the PropertyChangeListener 
  interface. The listener’s propertyChange() method is invoked to propagate 
  the change in the behavior’s value. 

Each registered listener for a primitive event or behavior is an FRP combinator. 
The combinator’s event handler will compute any changes to its output and 
invoke an event handler method on each of its registered listeners. Propagation 
of events continues in this way until some “output” listener is reached. 
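
The push model for behaviors can be sketched with the standard PropertyChangeSupport class. The code below is an illustrative approximation only: Node, Lift1 and update are hypothetical names, and a real combinator graph would also handle event sources and scheduling, which are omitted here.

```java
import java.beans.PropertyChangeEvent;
import java.beans.PropertyChangeListener;
import java.beans.PropertyChangeSupport;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

final class PushPropagationSketch {
    /** Minimal behavior node: a bean with one bound property named "value". */
    static class Node {
        protected final PropertyChangeSupport changes = new PropertyChangeSupport(this);
        private Object value;

        Node(Object initial) { value = initial; }

        Object getValue() { return value; }

        void addPropertyChangeListener(PropertyChangeListener l) {
            changes.addPropertyChangeListener(l);
        }

        /** Store a new value and push the change to every registered listener. */
        void update(Object newValue) {
            Object old = value;
            value = newValue;
            changes.firePropertyChange("value", old, newValue);
        }
    }

    /** Combinator node: recomputes its own value whenever its input changes. */
    static final class Lift1 extends Node implements PropertyChangeListener {
        private final Function<Object, Object> f;

        Lift1(Function<Object, Object> f, Node input) {
            super(f.apply(input.getValue()));
            this.f = f;
            input.addPropertyChangeListener(this);
        }

        @Override public void propertyChange(PropertyChangeEvent evt) {
            update(f.apply(evt.getNewValue()));
        }
    }

    /** A primitive input feeds a combinator, which feeds an "output" listener. */
    static List<Object> demo() {
        Node mouseX = new Node(0);
        Node doubled = new Lift1(x -> ((Integer) x) * 2, mouseX);
        List<Object> output = new ArrayList<>();
        doubled.addPropertyChangeListener(evt -> output.add(evt.getNewValue()));
        mouseX.update(3);
        mouseX.update(5);
        return output;
    }
}
```

Each change at the primitive input triggers exactly one recomputation in the combinator and one notification at the output; nothing is polled.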

4.4 Where Did the Time Go? 

In the SOE implementation of FRP, every behavior combinator has implicit access 
to an input stream of user actions and sample times. However, in the absence 
of time transformations, stream transformers that implement combinators only 
use this input stream to extract the current logical time. 

Like other FRP implementations, Frappe provides user programs with a 
Behavior that represents the current time: 

public class Animator { 

    public Behavior getTimeInstance(); 

} 

The behavior returned from getTimeInstance() is equivalent to the global behavior 
time in the Haskell-based implementations. This primitive behavior 
is implemented by a software timer (our implementation uses an instance of 
javax.swing.Timer internally). The implementation maintains an internal floating 
point value that represents the current time (in seconds) since the application 
started. 

Our treatment of time differs from the previous Haskell-based implementa- 
tions in that combinators that need access to the time instance must have the 
time instance passed in explicitly. Requiring that time be passed explicitly en- 
ables an important optimization: Frappe can eliminate the “sampling overhead” 
of the SOE implementation. In the SOE implementation, the implementation 
of every combinator has implicit access to the current time. An unfortunate 
consequence of this design is that the FRP implementation must maintain a 
“heartbeat”, pulling one value from the output stream at some high sample rate 
(typically 20-30 samples per second), in case some combinator implementation 
has an implicit time dependency. In contrast, the Frappe implementation only 
allows access to the current sample time by explicitly registering a listener with 
the behavior returned from getTimeInstance(). If this behavior has no registered 
listeners, then it is safe to disable the interval timer used by the implementation 
and allow the Java runtime’s event loop to block waiting for other input events. 
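
The timer optimization described above can be illustrated with a minimal sketch, written in Python for brevity rather than in Frappe's Java. The class and method names (TimeBehavior, add_listener, remove_listener, tick) are hypothetical and are not part of the Frappe API. The sketch only assumes that the time behavior keeps a list of listeners and that the interval timer needs to run only while that list is non-empty.

```python
class TimeBehavior:
    """Hypothetical stand-in for the behavior returned by getTimeInstance()."""

    def __init__(self):
        self._listeners = []
        self.timer_running = False  # models whether the interval timer is enabled

    def add_listener(self, listener):
        self._listeners.append(listener)
        self.timer_running = True  # some combinator now depends on time

    def remove_listener(self, listener):
        self._listeners.remove(listener)
        if not self._listeners:
            self.timer_running = False  # nobody depends on time: stop sampling

    def tick(self, now):
        """Called by the timer; pushes the sample time to every listener."""
        if not self.timer_running:
            return 0
        for listener in list(self._listeners):
            listener(now)
        return len(self._listeners)
```

With no registered listeners the timer stays disabled, so a program that never asks for the time incurs no sampling cost.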



5 Status and Availability 

The complete Frappe distribution is available from the Frappe web site at 
http://www.haskell.org/frappe under the terms of the GNU General Public 
License. We have working implementations of all of the core combinators and 
examples given in [14], using the encoding of behaviors and events presented here. 
For the most part, the implementation of these combinators is a straightforward 
translation of the formal definition into the Java language using the types and 
propagation model presented here. 

6 Limitations 



Because Java lacks a polymorphic type system, and because our implementa- 
tion makes extensive use of Java reflection, our implementation of FRP is not 
statically type-safe. In this respect, Frappe is no better and no worse than many 
other Java libraries, such as the Java collection classes. Nevertheless, it would be 
an interesting exercise to rewrite Frappe using GJ [1]. An alternative approach 
(that we are pursuing) is to use Frappe as a compilation target for some other 
high-level Haskell-like FRP notation. In this case, the front-end translator can 
perform polymorphic type-checking statically, and generate Frappe code that is 
guaranteed not to have type errors at runtime. 

Frappe assumes that event processing is single-threaded and synchronous. 
That is, all primitive Java Beans events used as event or behavior sources for 
Frappe must be fired from the system’s event dispatching thread, and each event 
must completely propagate through the FRP combinator graph before the next 
event is handled. This single-threaded, synchronous event processing model is 
also required by Java Swing, and Frappe does not impose any further restrictions 
than those already required for event handling in Swing. 

Like the stream-based implementation from which it derives, our implementa- 
tion of FRP is unable to detect instantaneous predicate events. An instantaneous 
predicate event is one that happens only at some specific instantaneous point in 
time. For example: 

sharp :: Event () 
sharp = when (time ==* 1) 

is only true instantaneously at time = 1. An event such as sharp cannot be 
detected simply by monotonic sampling; accurate detection of predicate events 
requires interval analysis, as discussed in [6, 3]. In many ways, the inability to 
detect instantaneous predicate events is similar to the problem of comparing two 
floating point numbers for equality using ==, lifted to the time domain. 
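
The sampling problem can be made concrete with a small Python sketch. It is not the interval analysis of [6, 3]: the function names and the sample step of 0.03 seconds are assumptions chosen for illustration. The sketch shows only that an equality test on sampled times misses the instant, whereas checking whether two consecutive samples bracket the target does not.

```python
def sample_times(step, count):
    """Monotonic sample times 0, step, 2*step, ... (illustrative sampler)."""
    return [k * step for k in range(count)]


def detect_by_equality(times, target):
    """Naive detection: fires only if some sample hits the target exactly."""
    return [t for t in times if t == target]


def detect_by_crossing(times, target):
    """Reports each pair of consecutive samples that brackets the target."""
    hits = []
    for t0, t1 in zip(times, times[1:]):
        if t0 < target <= t1:
            hits.append((t0, t1))
    return hits


times = sample_times(0.03, 50)
missed = detect_by_equality(times, 1.0)   # no sample is exactly 1.0
bracket = detect_by_crossing(times, 1.0)  # one interval contains time = 1
```

Bracketing localizes the event only to within one sample interval, which is why accurate detection needs the interval-based techniques cited above.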

Finally, Frappe does not support generalized time transformations. 



7 Related Work 

Elliott [3] has done much of the pioneering work on implementations of the FRP 
model in Haskell, and reported on the design tradeoffs of various implementation 
strategies. Hudak [11] provides a completely annotated description of a stream- 
based implementation of FRP from which our implementation is derived. 

Recent work in the functional programming community has produced ways 
to make component objects and library code written in imperative languages 
available from Haskell [8, 7, 12]. Our work and this previous work share the 
common goal of providing programmers with a declarative model for connect- 
ing component objects written in imperative languages. However, our approach 
can be viewed as the “inverse” of these efforts: instead of embedding calls to 
component objects written in an imperative language into a declarative pro- 
gramming model, Frappe takes a declarative programming model and embeds it 
in an imperative language that supports component objects. 

Elliott’s work on declarative event-oriented programming [5] showed that 
FRP’s event model (implemented in Fran) could be used to compose interactive 
event-driven user interfaces in a declarative style, and compared this to the con- 
ventional imperative approaches for programming user interfaces. FranTk [16] 
is a complete binding of the FRP programming model to the Tk user interface 
toolkit. FranTk demonstrates the viability of using FRP for user interfaces, and 
inspired us to explore how we might adapt the FRP model for use with the Java 
Swing toolkit. 

8 Conclusions and Future Work 

We have presented an implementation of FRP in the Java programming lan- 
guage. The most significant aspect of our implementation is that it is based on 
a close correspondence between the FRP event/behavior model and the Java 
Beans event/property model. 

One of the unique aspects of Frappe is its ability to use Java Beans compo- 
nents as sources or sinks for FRP combinators. In principle there is no reason why 
this feature needs to be limited to our Java implementation of FRP. It would 
be interesting to explore adding a similar feature to one of the Haskell-based 
implementations of FRP using COM objects as components. 

Our experience with Frappe to date is limited, but promising. As discussed 
in section 3, Frappe programs compare favorably with similar programs written 
in pure Java / Swing, both in terms of program size and conceptual clarity. 
Nevertheless, Frappe is still exceedingly verbose when compared with Haskell- 
based implementations of FRP. To correct this deficiency, and to provide for 
static type checking of Frappe programs, we are currently developing a translator 
that compiles a Haskell-like FRP notation to the corresponding Frappe code. A 
prototype of this translator already works for some small examples, and the 
results appear promising. 
Abstract. Subject directories and search engines are the most com- 
monly used tools to search for information on the World Wide Web. 
We propose a novel notion of subject meta-directory on top of standard 
subject directories, which strongly resembles the way in which meta- 
search engines extend standard search engines. After analyzing the main 
choices arising in the design of subject meta-directories, we describe the 
implementation of WebGate, a subject meta-directory entirely written 
in Prolog. 

Keywords: Prolog, Web-programming, subject directories. 



1 Introduction 

The Internet is undoubtedly the largest source of information available nowa- 
days. Indeed the World Wide Web (WWW) contains an incomparable amount 
of heterogeneous information presented in heterogeneous formats, such as text, 
images or music. The tremendous and constant expansion of the WWW makes it 
difficult even to evaluate the number of documents available. Recent reports 
[4, 13] estimate the existence of more than 800 million indexable documents, 
and conjecture that this number will at least double every year. 

As the amount of available information grows, the search for specific infor- 
mation becomes a critical issue of primary importance. According to [4], about 
70% of Web users declare that information search is their primary usage of the 
WWW, and that the inability to find desired information is the second most 
frustrating aspect of the WWW (after the slowness of response). 

Search engines and subject directories are the most commonly used tools to 
search for information on the World Wide Web. However, as we shall discuss in 
Sect. 2, both search engines and subject directories present a number of limi- 
tations which affect their effectiveness. So-called meta-search engines have been 
introduced to overcome some of the limitations of search engines. Simply stated, 
a meta-search engine accepts term-based queries from the user, forwards them 
(in parallel) to a set of available search engines, combines the obtained results, 
and presents them to the user. Meta-search engines are parasitic services that 
rely on available search engines and use a minimal amount of local resources in 
order to provide a better coverage of the WWW. 
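
As an illustration of this behavior, the following Python sketch forwards a query to several engines in parallel and merges the answers. The function name meta_search and the representation of an engine as a callable returning a list of URLs are assumptions made for the example. Real meta-search engines also handle ranking, timeouts and result formatting, which are omitted here.

```python
from concurrent.futures import ThreadPoolExecutor


def meta_search(query, engines):
    """Forward the query to every engine in parallel and merge the results.

    Each engine is assumed to be a callable mapping a query string to a
    list of result URLs (a hypothetical interface).
    """
    with ThreadPoolExecutor(max_workers=max(1, len(engines))) as pool:
        result_lists = list(pool.map(lambda engine: engine(query), engines))
    merged = []
    for results in result_lists:
        for url in results:
            if url not in merged:  # drop results already returned by another engine
                merged.append(url)
    return merged
```

No index is stored locally: only the merged list exists on the meta-search side, which reflects the parasitic nature of the service.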

In this paper we propose a novel notion of subject meta-directory, a service 
that parasitically relies on and combines available subject directories, much in 
the way meta-search engines extend search engines. We will show that, in spite 
of the above similarity, the design of a subject meta-directory is considerably 
more complex than the design of a meta-search engine, both from the conceptual 
and from the implementation viewpoint. 

We will then describe the design and implementation of WebGate, a subject 
meta-directory entirely implemented in Prolog. Indeed, the choice of using 
declarative programming for implementing WebGate has paid off. 
Programming by logically formulating executable specifications notably eases 
the development of simple and concise programming solutions. The resulting 
code is simple and easy to understand, debug and maintain. We believe that 
the development of WebGate contributes to demonstrating that declarative 
programming can be fruitfully applied to develop Web-based applications. 
Within space limits, we will comment on some illustrative parts of the code 
to highlight the adequacy of high-level declarative programming for developing 
Web-based applications. 

The rest of the paper is organized as follows. Section 2 briefly describes the 
state of the art of search engines, subject directories and meta-search engines 
for the WWW. A general definition of subject meta-directory is given in Sect. 3, 
while the design and implementation of WebGate are discussed in Sect. 4. 
Finally, Sect. 5 contains some concluding remarks. 



2 Search Engines, Directories, and Meta-search Engines 



Search engines index huge amounts of documents and support (syntactic) term- 
based searches. AltaVista [1] and HotBot [10] are two well-known examples of 
search engines. Subject directories classify Web sites in terms of categories 
organized in a hierarchical structure (catalog). Each category contains, in 
general, internal links to other (sub-)categories and external links classified 
as belonging to that category. Users may perform term-based searches on the 
catalog as well as directly browse the categories. Yahoo! [23], Open Directory 
[18] and LookSmart [16] are some of the best known subject directories. 

Subject directories have several advantages over search engines, mainly due 
to the fact that the classification process is performed (or at least supervised) by 
human experts (the so-called editors). Perhaps the most important advantages 
of subject directories are that: (1) Search engines index pages, while directories 
classify sites and present them to the user in a structured way. As a consequence, a 
disciplined browsing of a subject directory is generally a lot less time consuming 
than analyzing a flat set of results returned by a search engine. (2) Subject 
directories ensure a high relevance of the classified information. The number of 
false hits (links to non-existing or non-relevant documents) is much lower when 
working with a directory rather than with a search engine. 

On the other hand, subject directories present typical limitations of 
human-based services, in particular the relatively small amount of classified 
information and the slowness of the updating process. 
It is fair to observe, however, that the choice between search engines and 
subject directories depends on the specific needs of the user. Moreover, many 
search engines include a catalog, and many subject directories feature the 
possibility of performing searches both inside the catalog and in the external Web. 

One of the weakest aspects of both search engines and directories is their 
limited coverage of the WWW. Existing search engines individually cover at most 
16% of the estimated total number of pages [13], and this percentage is even 
lower in the case of subject directories because of their manual compilation. 
Moreover, such percentages are decreasing with time [4, 17], thus showing an 
apparent inability of these tools to cope with the tremendous growth of the WWW. 

A better coverage of the WWW can be obtained by observing that the overlap 
between the sets of documents indexed by different services is quite low. A 
suitable combination of such services may therefore appreciably enhance the 
quality of the service offered. For instance, an ideal combination of the most 
prominent search engines may lead to a coverage of 42% of all indexed documents [13]. 

So-called meta-search engines have been developed to overcome (some of) 
the limitations of search engines. A meta-search engine features a user interface 
that is similar to the user interface of a standard search engine. Rather than 
performing searches on locally built indexes, meta-search engines directly send 
the user query to several search engines, and then collect and display the obtained 
results. Meta-search engines may improve the quality of service of existing search 
engines [12]. A natural question is how to define a corresponding notion of subject 
meta-directory on top of existing subject directories. This is the subject of the 
following section. 

3 Subject Meta-directories 

The term “meta-directory” is currently used to denote gateways to large 
collections of subject-oriented material [14]. More precisely, meta-directories 
list specialty directories and sites by subject, often providing forms for 
entering search terms directly. An example of a currently available 
meta-directory is The BigHub.com [21]. 

We argue that the added value of meta-directories w.r.t. object directories 
should be much the same as the added value of meta-search engines w.r.t. object 
search engines. Namely, users should be able to explore a meta-directory as 
if it were a standard object directory, the underlying combination of object 
directories being transparent to them. In our view, a meta-directory is hence a 
directory which: 

1. provides the information available in n existing object directories, 

2. provides a uniform presentation of such information according to a single 
reference hierarchy, and 

3. (most importantly) makes the combination of the object directories trans- 
parent to the user. 

Clearly the first aim of a meta-directory is to provide access to a set of classified 
information larger than what is already available on a single directory. A crucial 
aspect of a meta-directory is, however, to provide a uniform presentation of such 
information, which is organized by means of different subject hierarchies in 
different directories. Indeed, this second aspect makes the design of 
meta-directories considerably more complex than the design of meta-search 
engines, and it introduces several new design issues: 

— Re-classification. Directories contain classified information, and a proper in- 
tegration of such information requires a (re-)classification mechanism. (We 
will further discuss re-classification later.) 

— Reference hierarchy. There are basically two possibilities when choosing the 
reference hierarchy for the meta-directory: (a) Build a new taxonomy from 
scratch, general enough to include the taxonomies of all the object 
directories, or (b) inherit the taxonomy of one of the object directories. The 
latter solution is preferable as it is more realistic and it avoids the need to 
maintain a separate hierarchy (as well as the need to re-classify one object 
directory). 

— Link-to-category vs. category-to-category association. The above mentioned 
re-classification process is needed to associate the information present in the 
available directories with the hierarchy of the meta-directory. Such an associ- 
ation may be performed either by associating individual links to categories or 
by associating categories to categories. We will discuss later how the second 
approach adheres better to the parasitic nature of meta-directories. 

Let us now introduce some terminology that will be used in the rest of the paper. 
A meta-directory is represented by a hierarchical structure (meta-catalog) 
consisting of meta-categories. Each meta-category contains the information classified 
in k object categories of n existing object directories. If the category-to-category 
association is adopted, then a meta-directory can be simply represented by an 
association table defining the content of meta-categories, as illustrated in Fig. 1. 
Finally, it is worth mentioning two important implementation choices which arise 
in the design of a meta-directory: 

— On-line vs. off-line re-classification. Re-classification is a time consuming 
process. To retain efficiency, the re-classification process must be performed 
off-line so as to create beforehand all the information that the system needs 
to run the user interface. 

— Local information. The issue here is what information on the object direc- 
tories should be kept locally by the meta-directory. Intuitively speaking, the 
problem is to find a convenient trade-off between efficiency and the ability 
to deal with dynamic changes in the object directories. 

4 Design and Implementation of WebGate 

4.1 General Architecture 

We shall now present the software architecture of the meta-directory WebGate. 
The system consists of two main modules (Merge and Interface) and of a local 
database. The latter plays a crucial role in the system, as it contains the relation 
between categories which defines the logical content of the meta-directory. The 
overall structure of WebGate is reported in Fig. 2. 

Fig. 1. Structure of a meta-directory. (Panels: Meta-directory; Association table.) 

The Merge module builds the internal database by systematically visiting and 
classifying the object directories. More precisely, Merge classifies each category 
C of each object directory in order to find the most appropriate association 
(C, D) where D is a meta-category. Such associations are stored in the database 
for their future use by the Interface module. 

The Interface module implements the user services (browse and search) by 
exploiting the information produced by Merge and stored in the local database. 
The behavior of Interface resembles the behavior of meta-search engines. For 
each user request, Interface performs a number of accesses to remote directories 
to get the information needed to satisfy the request. This information is then 
processed and the result is presented to the user. (The main differences w.r.t. 
a meta-search engine are the role of the local database and the treatment of 
duplicated information.) 

It is worth noting that Merge does not necessarily have to run before 
Interface starts. Indeed, if the two modules are working simultaneously then the 
system will answer the user on the basis of the partial information available, 
as normally happens in subject directories when editors update the directory 
while users are browsing it. 

WebGate has been entirely developed in SICStus Prolog [11], using the Pillow 
library [6] for Web interaction. The local database is implemented with the 
mySQL DBMS [3]. The interaction between Prolog and mySQL is supported 
by a Prolog module that we have developed by exploiting the C client library 
available in the mySQL distribution. 

Fig. 2. A bird's-eye view of WebGate. 

4.2 Merge Module 

The Merge module combines two mechanisms: spidering and classification. 

Spidering is necessary to visit the object directories in order to identify the 
“relevant” categories they contain. A category C of an object directory X is 
relevant for the construction of the meta-directory only if C contains some links 
to sites external to X. Object directories are visited in a tree-like fashion (by 
ignoring so-called “alias” references). 

The classification mechanism aims at associating each category C of each 
object directory (where C is identified by the spidering) with one 
meta-category. The choice of building category-to-category associations (rather 
than link-to-category associations) is consistent with the parasitic nature of a 
meta-directory, and presents two main advantages. 

(i) Minimization of local resources: The table containing the set of associations 
(object-category, meta-category) is the only local information necessary to 
the meta-directory. No information on the content of each object-category 
is stored locally (namely the links contained in the object category). Such 
information will be downloaded only when needed, for instance when the 
user clicks on a meta-category to browse its contents. 

(ii) Flexibility: Dynamic insertions and deletions of links in object directories 
do not outdate meta-directories employing a category-to-category association 
table. In contrast, meta-directories based on link-to-category association 
are not able to deal with such updates without performing re-classification. 
(Subject directories are, however, constantly evolving both in content and in 
structure. The merge process of a meta-directory must therefore be 
periodically performed.) 
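
The two advantages can be illustrated with a minimal Python sketch of a category-to-category association table. The class MetaDirectory, its methods, and the fetch_links callback are hypothetical names introduced for the example. The removal of duplicate links is also an assumption about how duplicated information might be treated, not a description of WebGate.

```python
from collections import defaultdict


class MetaDirectory:
    """Stores only (object-category, meta-category) associations locally."""

    def __init__(self, fetch_links):
        # meta-category -> list of (directory name, object category)
        self._table = defaultdict(list)
        # callback that downloads the links of an object category on demand
        self._fetch_links = fetch_links

    def associate(self, meta_category, directory, object_category):
        self._table[meta_category].append((directory, object_category))

    def browse(self, meta_category):
        """Download the links of the associated object categories when needed."""
        links = []
        for directory, object_category in self._table.get(meta_category, []):
            for link in self._fetch_links(directory, object_category):
                if link not in links:  # assumed duplicate handling
                    links.append(link)
        return links
```

Because links are fetched at browse time, a link added to an object category becomes visible without re-classification, whereas a structural change in the object directory would still require a new merge.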

The choice of building category-to-category associations is certainly 
advantageous from the implementation viewpoint, although it may sometimes be too 
coarse-grained in terms of the overall classification quality. Indeed, the 
category-to-category association forces the association of all links of an object 
category with a single meta-category, general enough to contain all of them. On 
the other hand, a more accurate classification could fragment the links of the 
object category into several sets and perform a finer-grained association. 



Classification Mechanism. Before describing the internal architecture of the 
Merge module, we briefly introduce the classification mechanism used by Merge 
to (possibly) associate each object category with a meta-category. 

Such an association process is performed by integrating two types of 
classification: a name-based classification and a content-based classification. 
Informally, the former classifies an object category in terms of its name, which 
is actually a synthesis of the logical path characterizing it in its source 
directory. The second form of classification instead analyzes (part of) the 
information contained in the category. 

In keeping with the parasitic nature of meta-directories, Merge implements both 
types of classification by exploiting as much as possible the services featured 
by object directories. Existing subject directories support term-based searches 
within their catalog. The results of such searches typically include categories 
as well as links, each associated with a title, a short description and the 
category it belongs to. The availability of such on-line search facilities in the 
reference directory can hence be exploited for performing both classifications in 
a parasitic way. Indeed, both types of classification share the following general 
behavior: 

(1) Given an object category C, determine a set T_C of terms representing C. 

(2) Search a subset of the terms T_C in the reference catalog to get a set M of 
candidate meta-categories. 

(3) Analyze the set of candidates M to determine whether there is a 
meta-category D ∈ M which can be associated with C or not. 

A subset of the terms T_C determined at step (1) is OR-combined at step (2) to 
determine the set of candidate meta-categories. The evaluation of the obtained 
results is done in step (3). Step (3) compares fragments of text that are 
representative of different categories, by exploiting well-established IR 
techniques [22]. The ultimate objective of step (3) is to determine the candidate 
category with the highest similarity coefficient. 

Notice that WebGate exploits two different types of term-based searches 
supported by (most) existing subject directories: category search for name-based 
classification and site search for content-based classification. It is important to 
observe that the results returned by the search engine of the reference directory 
are typically ordered according to relevance criteria, and that such criteria take 
into account only the terms used in the search. We will therefore take the first n 
results as candidate meta-categories, where n is a system parameter determined 
by the accuracy degree of the meta-catalog to be built. 

We now briefly describe the two types of classification, by commenting in 
particular on how steps (1), (2) and (3) are performed. 



Name-Based Classification. Simply stated, name-based classification verifies 
the existence of a meta-category D whose name is similar enough to the name 
of the object category C. 

(1) The set of terms T_C coincides with name(C), the set of representative 
terms extracted from the path denoting C in its directory. For instance the 
name associated with the category path Computers:Programming:Component 
Frameworks is {Computers, Programming, Component, Frameworks}. 

(2) Each term in T_C is important, so all of them are employed in the search. 

(3) The choice of the best candidate from a set M of meta-categories is made 
by comparing the set of terms name(C) with the set of terms name(X) for 
each meta-category X in M. The degree of similarity of two sets of terms X 
and Y is evaluated according to Dice’s coefficient [22]: 
$s(X, Y) = \frac{2\,|X \cap Y|}{|X| + |Y|}$ 

Name-based classification is quite effective when the object catalog and the ref- 
erence catalog are similar both in their structure and in the category names 
used, at least for some parts of their catalog. 
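
A minimal Python sketch of name-based comparison follows. The helper names (name_terms, dice, best_by_name) and the acceptance threshold of 0.5 are assumptions made for illustration; the text only requires the names to be "similar enough" and does not state a threshold.

```python
def name_terms(path):
    """Split a category path such as 'Computers:Programming' into a set of terms."""
    terms = set()
    for segment in path.split(":"):
        terms.update(segment.split())
    return terms


def dice(x, y):
    """Dice's coefficient 2|X n Y| / (|X| + |Y|) for two sets of terms."""
    if not x and not y:
        return 0.0
    return 2 * len(x & y) / (len(x) + len(y))


def best_by_name(object_path, candidate_paths, threshold=0.5):
    """Return (path, score) of the most similar candidate, or None.

    The threshold is an assumed parameter, not taken from the paper.
    """
    target = name_terms(object_path)
    best = None
    for path in candidate_paths:
        score = dice(target, name_terms(path))
        if score >= threshold and (best is None or score > best[1]):
            best = (path, score)
    return best
```

For the path Computers:Programming:Component Frameworks, a candidate named Computers:Software:Component Frameworks shares three of four terms and scores 2·3/(4+4) = 0.75.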

Content-Based Classification. Simply stated, content-based classification 
verifies the existence of a meta-category D whose content is similar enough 
to the content of the object category C. 

(1) The content of a category C is represented by name(C) along with the terms 
occurring in the title or description of any (external) link contained in C. 

(2) The set of terms T_C may be too large for a full search, hence a subset of 
it must be chosen. This subset contains name(C) and the k most 
“representative” terms in T_C \ name(C). The representativeness of a term t for a 
category C depends on the number of different links of C containing t and 
on the overall number of occurrences of t in C. 

(3) The choice of the best candidate from a set M of meta-categories is 
performed by comparing the set of terms T_C with the set of terms T_X for each 
meta-category X in M. Such a content-based comparison is performed 
according to the vector space model [7, 20], intuitively by representing each 
category with a vector of weighted terms. 
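
Steps (2) and (3) can be sketched in Python as follows. The exact ranking formula and term weighting used by WebGate are not given in the text, so the sketch makes two labeled assumptions: terms are ranked first by the number of links containing them and then by total occurrences, and vectors use raw term frequencies compared by cosine similarity. All function names are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def representative_terms(link_texts, k):
    """Pick the k most representative terms of a category (assumed ranking)."""
    link_count = Counter()   # number of different links containing the term
    total_count = Counter()  # overall number of occurrences of the term
    for text in link_texts:
        tokens = tokenize(text)
        total_count.update(tokens)
        link_count.update(set(tokens))
    ranked = sorted(total_count,
                    key=lambda t: (-link_count[t], -total_count[t], t))
    return ranked[:k]


def cosine(u, v):
    """Cosine similarity of two term-weight vectors given as dicts."""
    dot = sum(w * v.get(t, 0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def category_vector(link_texts):
    """Raw term-frequency vector of a category (assumed weighting)."""
    vector = Counter()
    for text in link_texts:
        vector.update(tokenize(text))
    return vector
```

The candidate meta-category whose vector has the highest cosine with the vector of C would then be selected, subject to whatever acceptance criterion the system applies.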

It is worth noting that our content-based classification technique bears some 
similarities with the classification-by-context technique proposed in [2], where 
the text surrounding a hyper-link is used to classify the link itself. Our 
classification instead analyzes the text describing the links of a category to 
classify the overall content of the category. 
It is also worth observing that our content-based classification does not 
analyze all links of the candidate meta-categories. Indeed, the links of each 
candidate meta-category that are considered at step (3) are those links returned 
by the term-based search performed at the previous step. 



Implementation of Merge. We now briefly describe the internal architecture 
of the Merge module. As illustrated in Fig. 3, the two activities of spidering 
and classification are neatly separated and they are performed by specialized 
processes called spiders and classifiers, respectively. 

Spiders and classifiers cooperate and interact via a shared tuple space (TS) 
in the style of the Linda [9] coordination model, supported by SICStus Prolog. 
The construction of a meta-directory requires many network operations for 
accessing the object directories. To improve efficiency, both the spidering and the 
classification activity have hence been implemented by means of a set of 
parallel processes so as to reduce the delays and to better exploit the connection 
bandwidth by performing more accesses at a time. 



Fig. 3. Internal architecture of module Merge. 



Spiders explore the object directories and pass to the classifiers the object 
categories to be reclassified. Each spider consumes a tuple representing an object 
category C to be visited, accesses such category, and then (possibly) produces 
a number of tuples representing new (sub-)categories to be visited and a tuple 
stating that category C has to be classified. Each classifier instead consumes a 
tuple representing a category to be classified, tries to classify it and (possibly) 
updates the category-to-category association table stored in the local database. 
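
The cooperation between spiders and classifiers can be sketched with a toy tuple space in Python. The sketch is sequential and in-memory: it does not use SICStus Prolog's Linda library, it omits the parallelism, and every name in it (TupleSpace, spider_step, classifier_step, the shape of the directory dictionary) is a hypothetical stand-in.

```python
from collections import deque


class TupleSpace:
    """A toy shared tuple space; tuples are tagged by their first field."""

    def __init__(self):
        self._tuples = deque()

    def out(self, tup):
        self._tuples.append(tup)

    def take(self, tag):
        """Remove and return the first tuple with the given tag, or None."""
        for tup in self._tuples:
            if tup[0] == tag:
                self._tuples.remove(tup)
                return tup
        return None


def spider_step(ts, directory):
    """Consume one 'cat' tuple; produce sub-category and 'links' tuples."""
    tup = ts.take("cat")
    if tup is None:
        return False  # nothing left to visit
    _, category = tup
    subcategories, links = directory[category]
    for sub in subcategories:
        ts.out(("cat", sub))
    if links:  # only categories with external links need classification
        ts.out(("links", category, links))
    return True


def classifier_step(ts, classify, table):
    """Consume one 'links' tuple and possibly update the association table."""
    tup = ts.take("links")
    if tup is None:
        return False
    _, category, links = tup
    meta_category = classify(category, links)
    if meta_category is not None:
        table[category] = meta_category
    return True
```

Seeding the space with the root category and repeating both steps until they return False visits the whole directory tree and fills the association table.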
Spider Processes. The behavior of a generic spider is a failure-controlled cycle 
over the following predicate: 

spider :- in_tuple(cat(Dir,RelUrl)), 
          category_url(Dir,RelUrl,AbsUrl), 
          get_page(AbsUrl,Page), 
          get_path(Dir,Page,Path), 
          get_links(Dir,Page,Categories,Links), 
          add_tuples(Categories), 
          add_tuple(Path,Links). 

At each iteration, the spider removes from the tuple space a tuple representing 
an object category to be visited. Such tuples have the form cat (Dir, RelUrl), 
where Dir is the name of the object directory and RelUrl the relative URL of the 
category. The removal of the tuple is performed by predicate in_tuple/l which 
extends Linda’s predicate in/1 to check the possible termination of the spidering 
task^. Given the name of the directory and the relative URL of the category, 
predicate category _url/3 determines the absolute URL of the page containing 
the category. Such page is accessed and converted into a list of terms by pred- 
icate get_page/2, which is defined in terms of Pillow’s predicates fetch_url/3 e 
html2terms/2. Predicate get_page/2 implements also a minimal error handling 
to decide whether to retry accessing the page in case of errors. Predicates 
get_path/3 and get_links/4 analyze the list of terms representing the page con- 
tent in order to extract from them the logical path of the category (Path), and 
the sub-categories (Categories) and links (Links) contained therein. Predicate 
add_tuples/l adds to the tuple space a tuple of the form cat (Dir, RelUrl) for each 
sub-category found during the analysis of the page. If the category contains some 
external link then predicate add_tuple/2 generates a temporary file containing 
the information needed for the classification process (i.e., the logical path and 
the links), and it also generates a tuple of the form links (Dir , RelUrl, File) to 
notify the classifiers that there is a new category to be classified. 

Classifier Processes. The role of classifier processes is to classify the object 
categories found by the spiders. As for spider processes, the behaviors of clas- 
sifiers is defined by means of a failure-controlled loop iterating the following 
predicate: 

classifier(RefDir, Quality) :-
    in_tuple(links(SrcDir, SrcUrl, File)),
    find_best(Quality, File, RefDir, RefUrl, Rate),
    create_assoc(RefUrl, SrcDir, SrcUrl, Rate),
    delete_file(File).

The first argument RefDir is the name of the reference directory^, while Quality
is the accuracy degree requested for the classification (low/normal/high).
Predicate in_tuple/1 extracts a tuple of the form links(Dir, RelUrl, File)
available in the tuple space, and it controls the possible termination of the
classification task, similarly to the spider process previously described.
Predicate find_best/5 determines the best (i.e., the most similar) category in
the reference catalog and returns its relative URL (RefUrl) along with the
confidence level of the association (Rate). The result of the classification is
passed to predicate create_assoc/4, which adds the new association to the
association table stored in the local database.

^ In such case the loop is broken via a Prolog failure.

^ The parameterization of the name of the reference directory is for the sake of
generality of the system. Currently WebGate uses only Yahoo! as reference
directory.

Predicate find_best/5 implements the two-level classification mechanism de- 
scribed in the previous section. Name-based classification is tried first. If it does 
not succeed, then content-based classification is performed. The executable spec- 
ification of the above described classification process is expressed by the following 
clauses: 

find_best(Quality, File, RefDir, RefUrl, Sim) :-
    classification(by_name, Quality, File, RefDir, RefUrl, Sim).
find_best(Quality, File, RefDir, RefUrl, Sim) :-
    \+ ( classification(by_name, Quality, File, RefDir, RefUrl, Sim) ),
    classification(by_content, Quality, File, RefDir, RefUrl, Sim).
find_best(Quality, File, RefDir, nil, 0) :-
    \+ ( classification(by_name, Quality, File, RefDir, RefUrl, Sim) ),
    \+ ( classification(by_content, Quality, File, RefDir, RefUrl, Sim) ).

classification(Type, Quality, File, RefDir, RefUrl, Sim) :-
    classify(Type, Quality, File, RefDir, RefUrl, Sim),
    threshold(Type, Quality, T),
    Sim > T.



Notice that a classification may fail because the confidence level of the
association determined is too low: classification/6 employs thresholds to
discard unsatisfactory classifications. It is worth noting that the choice of
the Quality parameter affects the efficiency of the entire classification
process. If high accuracy is requested, then content-based classification will
most probably be triggered, and the corresponding efficiency cost will be paid.
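The interplay between thresholds and the two-level fallback can be tried out in
isolation. The following is only a minimal sketch for SWI-Prolog: classify/6 is
replaced by a stub, and the scores, the category '/Science/Physics', the file
names f1..f3 and the threshold values are all hypothetical, not WebGate's.

```prolog
% Stub for classify/6: hypothetical scores replace the catalog lookup.
classify(by_name, _Quality, File, _RefDir, '/Science/Physics', Sim) :-
    name_score(File, Sim).
classify(by_content, _Quality, File, _RefDir, '/Science/Physics', Sim) :-
    content_score(File, Sim).

% Illustrative similarity scores for three temporary files.
name_score(f1, 0.90).
name_score(f2, 0.40).
name_score(f3, 0.10).

content_score(f1, 0.95).
content_score(f2, 0.85).
content_score(f3, 0.20).

% threshold(Type, Quality, T): assumed values, identical for both types.
threshold(_Type, low,    0.3).
threshold(_Type, normal, 0.5).
threshold(_Type, high,   0.8).

% A classification succeeds only if its similarity exceeds the threshold.
classification(Type, Quality, File, RefDir, RefUrl, Sim) :-
    classify(Type, Quality, File, RefDir, RefUrl, Sim),
    threshold(Type, Quality, T),
    Sim > T.

% Name-based first, then content-based, then the "no association" answer.
find_best(Quality, File, RefDir, RefUrl, Sim) :-
    classification(by_name, Quality, File, RefDir, RefUrl, Sim).
find_best(Quality, File, RefDir, RefUrl, Sim) :-
    \+ classification(by_name, Quality, File, RefDir, _, _),
    classification(by_content, Quality, File, RefDir, RefUrl, Sim).
find_best(Quality, File, RefDir, nil, 0) :-
    \+ classification(by_name, Quality, File, RefDir, _, _),
    \+ classification(by_content, Quality, File, RefDir, _, _).
```

With these assumed numbers, the query find_best(low, f2, yahoo, U, S) is answered
by the name-based step, whereas find_best(normal, f2, yahoo, U, S) falls through
to the content-based one, which illustrates how a higher Quality tends to
trigger the more expensive technique.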

Both classification techniques rely on common operations such as extract- 
ing terms from HTML text fragments, formulating (term-based) queries to the 
reference catalog, and computing similarities. 

Text analysis is performed assuming English as the default natural language.
Terms are filtered so as to eliminate all the members of a predefined set of
stop-words that are irrelevant for classification purposes (e.g., articles or
pronouns). To this end, WebGate refers to the list of stop-words available as a
Prolog database at [15]. The corresponding parser can be naturally implemented
in Prolog by means of a Definite Clause Grammar (DCG), as illustrated by the
following code fragment:



terms(Terms-Terms1) -->
    separators,
    terms(Terms-Terms1).
terms(Terms-Terms1) -->
    term(Term),
    { stop_word(Term) },
    terms(Terms-Terms1).
terms([Term|Terms]-Terms1) -->
    term(Term),
    { \+ stop_word(Term) },
    terms(Terms-Terms1).
terms(Terms-Terms) --> [].

Predicate terms/3 extracts the relevant terms from a text and returns them as
a difference list, in order to support the efficient concatenation of lists
extracted from different text fragments.
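To make the grammar above executable, the non-terminals separators//0 and
term//1 and the stop-word table must be supplied. The sketch below (SWI-Prolog)
fills them in under our own assumptions: a separator is any non-letter
character code, a term is a maximal run of ASCII letters converted to a
lower-case atom, and the four stop-words are merely illustrative. The helper
text_terms/2 is hypothetical as well.

```prolog
% Hypothetical stop-word table (WebGate loads its list from [15]).
stop_word(the).
stop_word(a).
stop_word(and).
stop_word(of).

% Assumed lexical conventions: ASCII letters form terms, the rest separates.
letter(C) :- ( C >= 0'a, C =< 0'z ; C >= 0'A, C =< 0'Z ).

% One or more separator codes, consumed greedily.
separators --> separator, more_separators.
more_separators --> separator, !, more_separators.
more_separators --> [].
separator --> [C], { \+ letter(C) }.

% A term is a non-empty maximal run of letters, returned in lower case.
term(Term) -->
    letters(Cs),
    { Cs = [_|_], atom_codes(A, Cs), downcase_atom(A, Term) }.
letters([C|Cs]) --> [C], { letter(C) }, !, letters(Cs).
letters([]) --> [].

% The grammar of the paper, unchanged.
terms(Terms-Terms1) -->
    separators,
    terms(Terms-Terms1).
terms(Terms-Terms1) -->
    term(Term),
    { stop_word(Term) },
    terms(Terms-Terms1).
terms([Term|Terms]-Terms1) -->
    term(Term),
    { \+ stop_word(Term) },
    terms(Terms-Terms1).
terms(Terms-Terms) --> [].

% text_terms(+Text, ?Terms-Tail): parse an atom into a difference list.
text_terms(Text, Terms-Tail) :-
    atom_codes(Text, Codes),
    once(phrase(terms(Terms-Tail), Codes)).
```

Because the result is a difference list, two fragments can be chained without
calling append/3: in the query
text_terms('The spider', L-T), text_terms('and a classifier', T-[])
the open tail T of the first list is simply instantiated by the second parse.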

Term-based search is performed by suitably invoking the underlying CGI
script in the remote reference directory. The needed query format can be
detected by examining the URL of the pages containing the returned search
results. Advanced search queries in Yahoo! (the reference catalog for WebGate)
have the format:

    http://search.yahoo.com/search?p=...&o=...&h=...&n=...

where search parameters are separated by & and include the search terms p
(terms are concatenated by +), the operator o for combining search terms, the
type of search h (for categories or for links), and the number n of results to
be returned per page.
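Assembling such a URL is a matter of string concatenation. The following sketch
(SWI-Prolog) shows one possible way to do it; build_search_url/5 is a
hypothetical helper, not WebGate's build_query/5, the values passed for o and h
are placeholders, and no URL-encoding of the terms is attempted.

```prolog
% build_search_url(+Terms, +Op, +Type, +N, -Url)
% Terms are joined with '+' and the four parameters are spliced into the
% query format described above.
build_search_url(Terms, Op, Type, N, Url) :-
    atomic_list_concat(Terms, '+', P),
    format(atom(Url),
           'http://search.yahoo.com/search?p=~w&o=~w&h=~w&n=~w',
           [P, Op, Type, N]).
```

For instance, build_search_url([logic, programming], op, type, 20, Url) binds
Url to an atom whose p parameter is logic+programming; the atoms op and type
stand in for the actual operator and search-type codes, which are not given
here.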

Finally, the computation of the similarity between two sets of terms is defined
by a predicate terms_sim/4, which is parametric w.r.t. the similarity
coefficient to be used. This predicate is defined in terms of distributions
(viz., term-frequency tables) rather than in terms of plain lists, in order to
treat different kinds of functions uniformly and to make the system easy to
modify.
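A similarity predicate that is parametric in the coefficient can be sketched as
follows (SWI-Prolog). Everything here is an assumption made for illustration:
the argument order of terms_sim/4, the representation of a distribution as a
sorted list of Term-Count pairs, the helper dist/2, and the choice of the
Jaccard and cosine coefficients.

```prolog
:- use_module(library(lists)).
:- use_module(library(pairs)).

% dist(+Terms, -Dist): term-frequency table as sorted Term-Count pairs.
dist(Terms, Dist) :-
    msort(Terms, Sorted),
    count_runs(Sorted, Dist).

count_runs([], []).
count_runs([T|Ts], [T-N|Rest]) :-
    take_same(T, Ts, 1, N, Others),
    count_runs(Others, Rest).

% Count the leading occurrences of T and return the remaining terms.
take_same(T, [T0|Ts], N0, N, Others) :-
    T0 == T, !,
    N1 is N0 + 1,
    take_same(T, Ts, N1, N, Others).
take_same(_, Others, N, N, Others).

% terms_sim(+Coefficient, +Dist1, +Dist2, -Sim)
% Jaccard ignores frequencies; cosine uses them. Both fail on empty input.
terms_sim(jaccard, D1, D2, Sim) :-
    pairs_keys(D1, K1),
    pairs_keys(D2, K2),
    intersection(K1, K2, I),
    union(K1, K2, U),
    length(I, NI),
    length(U, NU),
    NU > 0,
    Sim is NI / NU.
terms_sim(cosine, D1, D2, Sim) :-
    dot(D1, D2, Dot),
    norm(D1, N1),
    norm(D2, N2),
    N1 > 0, N2 > 0,
    Sim is Dot / (N1 * N2).

dot(D1, D2, Dot) :-
    findall(P, ( member(T-C1, D1), memberchk(T-C2, D2), P is C1 * C2 ), Ps),
    sum_list(Ps, Dot).

norm(D, N) :-
    findall(S, ( member(_-C, D), S is C * C ), Ss),
    sum_list(Ss, Sum),
    N is sqrt(Sum).
```

Adding a further coefficient only requires one more terms_sim/4 clause, which
is the kind of modifiability that working on distributions is meant to give.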

Both types of classifications share the same general structure, as summarized 
by predicate classify/6: 

classify(ClassType, Quality, File, RefDir, RefUrl, Sim) :-
    rep_terms(ClassType, File, Terms),
    get_candidates(ClassType, Quality, Terms, RefDir, Cands),
    best_candidate(ClassType, Quality, Terms, Cands, RefUrl, Sim).

Predicate rep_terms/3 extracts from the temporary File all the terms required 
for the classification, by exploiting the parser illustrated above. Candidate cat- 
egories are determined by predicate get_candidates/5, which performs a term- 
based search in the reference catalog: 

get_candidates(ClassType, Quality, Terms, RefDir, Cands) :-
    search_terms(Quality, ClassType, Terms, Keys),
    result_number(Quality, N),
    build_query(ClassType, RefDir, Keys, N, Url),
    url_access(Url, Page),
    get_results(ClassType, RefDir, Page, Cands).

Predicates search_terms/4 and result_number/2 determine, respectively, the set
Keys of terms to be used in the query and the number of results to be returned
by the reference directory. Predicate build_query/5 (parametric w.r.t. the
reference catalog) arranges such parameters in a query according to the URL
format previously discussed. Search results are extracted by predicate
get_results/4, which analyses the obtained HTML page in the same fashion as
predicate get_links/4 encountered in the spiders’ description. Finally, the best
candidate is chosen by predicate best_candidate/6:

best_candidate(ClassType, Quality, Terms, Cands, RefUrl, Sim) :-
    terms_to_dist(stems, Terms, Stems),
    eval_res(ClassType, Results, Stems, Sims),
    best_res(Sims, (RefUrl:Sim)).

Predicate eval_res/5 computes the similarity of each candidate w.r.t. the object
category, by comparing the corresponding sets of terms via the similarity
function of the chosen classification technique. Distributions are built out of
lists of terms by predicate terms_to_dist/3, whose first argument specifies here
that stemming is to be applied to terms (using Porter’s algorithm [19]). For the
sake of efficiency, the distribution of the object category is computed outside
predicate eval_res/5, while the distributions of the candidate categories are
built inside it.

The comparison results are returned as a list of pairs (Candidate, Similarity),
which is examined by predicate best_res/2 in order to find the one with the
highest similarity value.
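Selecting the best pair is a simple fold over the list. The sketch below
(SWI-Prolog) is one possible definition of best_res/2, assuming that pairs are
written Url:Sim as in the call shown in best_candidate/6, that the list is
non-empty, and that ties are resolved in favour of the earlier candidate; the
helper better/3 and the candidate names are hypothetical.

```prolog
:- use_module(library(apply)).

% best_res(+Sims, -Best): Sims is a non-empty list of Url:Sim pairs.
% The first pair seeds the accumulator; each later pair replaces it only
% if its similarity is strictly higher.
best_res([P|Ps], Best) :-
    foldl(better, Ps, P, Best).

% better(+Candidate, +BestSoFar, -NewBest)
better(U:S, _:S0, U:S) :- S > S0, !.
better(_, Best, Best).
```

For example, best_res(['/A':0.2, '/B':0.7, '/C':0.5], Best) selects the pair
for '/B'. On an empty list the predicate fails, which fits the
failure-driven style of the classifier loop.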



4.3 Interface Module 

As illustrated in Fig. 2, WebGate is composed of two main modules (Merge and
Interface) and of a local database containing the category-to-category
associations which define the logical structure of the meta-directory. The
Interface module is in charge of making the system available on the WWW via
HTML pages which users can access by means of standard browsers. According to
the definition of meta-directory given in Sect. 3, users must be able to explore
the meta-directory as if it were a standard directory, the underlying
combination of object directories being transparent to them.

Users access the information available in standard directories either by
browsing the catalog or by performing term-based searches. These two modalities
require very similar operations from the viewpoint of the meta-directory. In
both cases, the system must perform one or more (parallel) accesses to remote
object directories, suitably combine the obtained results, and present them in a
uniform way to the user. Such behavior can be naturally implemented by
structuring the Interface module into two sub-modules, one supporting catalog
browsing and the other supporting term-based searching. Both sub-modules are
parallel (to support multiple simultaneous accesses) and are implemented^ by
means of CGI scripts written in Prolog and invoked inside the pages of the
meta-directory. The execution of such scripts is supported by the shell psh
(Prolog SHell), which we have developed for this purpose.

The script in charge of browsing is invoked every time the user selects
a meta-category from the pages of the meta-directory. The script receives the
physical path of the homonymous category in the reference directory and builds
on the fly the HTML page describing the requested meta-category. To do this,
the script downloads information from the reference directory (links and
sub-categories) and from other object directories (just links), by exploiting
the association table contained in the local database. The most interesting
aspects of the process are the policies employed for removing duplicate links
(while merging their descriptions) and for the multi-page visualization of the
links contained in a meta-category (as in standard search engines). Regrettably,
lack of space prevents us from discussing these issues in greater detail.

^ The current version of WebGate supports only the browsing of the
meta-directory.



5 Concluding Remarks

The present paper contains two main contributions: 

(1) The introduction of a novel notion of subject meta-directory, a service
    presenting in a uniform way the information available in several object
    directories while making their combination transparent to the user. In this
    respect, we argue (see Sect. 3) that available “meta-directories” are
    intended to be meta-subject directories rather than subject
    meta-directories.

(2) The description of the design and implementation of WebGate, a subject
    meta-directory entirely developed in Prolog. Our experience is that the
    choice of using declarative programming has notably eased the development
    of the system. Programming by logically formulating executable
    specifications has led to concise code which is easy to understand, debug
    and maintain.

The idea of employing a reference hierarchy and an association table to combine 
separate object directories into a subject meta-directory somehow resembles the 
idea of using a structured universal relation and compatibility rules in the design 
of webbases [8]. 

Generally speaking, we believe that the development of WebGate contributes
to demonstrating that declarative programming can be fruitfully applied to
develop Web-based applications. It is perhaps worth making a remark on
efficiency here. Logic programming languages have the reputation of not being
efficient enough. While advances in this direction are perhaps sometimes
underestimated, the impact of Prolog implementation efficiency is indeed quite
limited for systems that massively perform networking operations. Indeed, for
systems like PrologCrawler [5] (a meta-search engine entirely written in
Prolog), the time needed for performing networking operations is by far larger
than the time needed for performing Prolog computations. Moreover, in the case
of WebGate the most expensive process is the construction of the meta-catalog,
which is performed off-line by the Merge module.

The current version^ of WebGate integrates two object directories: Yahoo!
(which is taken as reference directory) and OpenDirectory. We plan to extend
the set of object directories so as to further improve the coverage of the
system. Notably, as described in Sect. 4, such an extension requires only a
trivial extension of the system in order to trigger the spidering over the new
catalogs. Future work will also be devoted to completing the development of the
Interface module so as to support the search facility over the meta-directory.
This second extension, though not as trivial as the former, can nevertheless be
implemented at low cost, as pointed out in Sect. 4.



^ Available at: http://venus.di.unipi.it/WebGate/



Abstract. In this paper we show how an agent programming language,
based on a formal theory of actions, can be employed to implement
adaptive web applications, where a personalized dynamical site generation
is guided by the user’s needs. For this purpose, we have developed an on-line
computer seller in DyLOG, a modal logic programming language which
allows one to specify agents acting, interacting, and planning in dynamic
environments. Adaptation at the navigation level is realized by
dynamically building a presentation plan for solving the problem of
assembling a computer, driven by goals generated by interacting with the
user. The planning capabilities of DyLOG are exploited to implement the
automated generation of a presentation plan to achieve the goals. The
DyLOG agent is the “reasoning” component of a larger system, called
WLog, which is described in this paper.



1 Introduction and Related Works 

Recent years have witnessed a rapid expansion of the use of multimedia
technologies, the web in particular, for the most varied purposes:
advertisement, information, communication, commerce and distribution of services
are just some examples. One problem that arises because of the wide variety of
users of such tools is to find a way of adapting the presentation to the
particular user. Many of the most advanced solutions [20, 18, 1, 9, 10] start
from the assumption that adaptation should focus on the user’s own
characteristics; thus, though in different ways, they all try to associate
him/her with a reference prototype (also known as the “user model”), and the
presentation is then adapted to user prototypes. The association between the
user and a model is done either a priori, by asking the user to fill in a form,
or little by little, by inducing preferences and interests from the user’s
choices. In some cases, such as in the SeTA project [2], a hybrid solution is
adopted where, first, an a priori model is built and, then, it is refined by
exploiting the user’s choices and selections.

In our view, such solutions lack one important feature: the representation of
the user’s “goals” (or intentions), which may change at every connection. Let us
suppose, for example, that I access an on-line newspaper for some time, always
searching for sport news. One day, after I have heard about a terrible accident
in a foreign country, I access the newspaper to get more information. If
the system has, meanwhile, induced that my only interest is sport, I could have
difficulties in getting the information I am interested in, or have it presented
at too shallow a level of detail with respect to my current interest. This kind
of inconvenience would not have happened if the system I interacted with had
tried to capture my goal for the current connection and only afterwards tried to
adapt the presentation to me (not limiting its reasoning to my generic interests
only).



Seller (asking):     - What do you need the computer for?
Client:              - I would use it for multimedia purposes.
Seller (thinking):   - Well, let me think, he needs a configuration with a huge
                       monitor, any kind of RAM, any kind of CPU, a sound-card,
                       and a modem. But he may have some of these components.
                       Let’s list them.
Seller (asking):     - Do you already have any of the listed components?
Client:              - Yes, I have a fast CPU and a sound-card.
Seller (asking):     - Do you have a limited budget?
Client:              - Yes, 650 Euro.
Seller (thinking):   - He needs a monitor, RAM and a modem. I need to plan
                       according to his needs, budget and current availability
                       of the components in the store.
Seller (asking):     - I have different configurations that satisfy your needs.
                       Let me ask first which of the listed RAMs you prefer.
Client:              - I prefer to have 128MB of RAM.
Seller (informing):  - I propose you this configuration. Do you like it?

Table 1. Example of dialogue between a client and a seller



Our research aims at addressing this deficit, building a web system that,
during the interaction with the user, adopts the user’s goals in order to
achieve a more complete adaptation. More technically, we study the
implementation of web sites which are “structureless”, in that their shape
depends on the goals of the individual users. The structure of the web site
depends on the interaction between the user and a server-side agent system. The
notion of computational agent is central in artificial intelligence (AI),
because it supplies a powerful abstraction tool for characterizing complex
systems, situated in dynamic environments, by using mentalistic notions. In this
perspective, we describe a system in terms of its beliefs about the world, its
goals, and its capabilities of acting; the system must be able to autonomously
plan and execute action sequences for achieving its purposes. When a user
connects to a site managed by one of our agents, (s)he does not access a fixed
graph of pages and links but interacts with a program which, starting from a
knowledge base specific to the site and from the requests of the user, builds an
ad hoc structure. This differs from current dynamic web sites, where pages are
dynamically constructed but the site structure is not. In our approach, such a
structure corresponds to a plan aimed at pursuing the user’s goals. So, the
user’s goal is the bias that orients the presentation of domain-dependent
information (text, images, videos and so forth). This approach is “orthogonal”
to the one based on user models, which is already widely studied in the
literature.

Our approach may also recall Natural Language cooperative dialogue systems,
but there are some differences. For instance, in [8] a logic of rational
interaction is proposed for implementing the dialogue management components
of a spoken dialogue system. This work is based (like ours, as we will see) on
dynamic logic, and reasoning capabilities on actions and intentions are
exploited to plan dialogue acts. Our focus, however, is not on recognizing or
inferring the user’s intentions. The user’s needs are taken as input by the
software agent, which uses them to generate the goals that will drive its
behaviour. The novelty of our approach lies in exploiting planning capabilities
not for dialogue act planning but for building web site presentation plans
guided by the user’s goal. In this perspective the structure of the site is
built by the system as a conditional plan, according to the initial user’s needs
and constraints. The execution of the plan by the system corresponds to the
navigation of the site, where the user and the system cooperate for building the
configuration satisfying the user’s needs. Indeed, during the execution the
choice between the branches of the conditional plan is determined by means of
the interaction with the user.

The language that we used for writing the server-side agent program is
DyLOG [7, 5, 4]. DyLOG is based on a logical theory for reasoning about actions
and change in a logic programming setting. It allows one to specify an agent’s
behavior by defining both a set of simple actions that the agent can perform
(they are defined in terms of their preconditions and effects) and a set of
complex actions (procedures), built upon simple ones. DyLOG has been implemented
in SICStus Prolog; a specification in DyLOG can be executed by an interpreter,
which is a straightforward implementation of the proof procedure of the
language.

This language is particularly interesting for agent programming, and in
particular for our application, for two of its main characteristics. The first
is that, being based on a formal theory of actions, it can deal with reasoning
about action effects in a dynamically changing environment and, as such, it
supports planning. Reasoning about the effect of actions in a dynamically
changing world is one of the main problems that must be faced by intelligent
agents, even in the case in which we consider the internal dynamics of the agent
itself, i.e. the way it updates its beliefs and its goals, which can be regarded
as the result of the execution of actions on its mental state. The second is
that the logical characterization of the language is very close to the
procedural one, and this allows one to reduce the gap between theory and
practical use. In the literature, other languages (e.g. GOLOG [16]) have been
developed for reasoning in dynamic domains and for agent programming. However,
there was an advantage in using DyLOG in the current work: it has a sound proof
procedure, which in practice allows one to deal with the planning task in the
presence of sensing. Indeed, conditional plans (not only linear plans) can be
automatically extracted, as we explain in Section 3.4.

To summarize, in this work we use DyLOG for building cognitive agents that
autonomously reason on their own behavior in order to obtain web site
adaptation at the navigation level, i.e. to dynamically generate a site guided
by the user’s goals; such goals are explicitly adopted by the system throughout
the entire interaction. In this domain, one of the most important aspects is to
define the navigation possibilities available to the user and to determine which
page to display, based on the dynamics of the interaction. The system that we
realized and that uses the DyLOG agent is called WLog.

The approach that we propose brings along various innovations. From a
human-machine interaction perspective, the user will not have to fill in forms
requesting pieces of information which (s)he does not consider useful for
exploring the site (for instance, his/her education). Moreover, the system will
not restrict its answers to a user model which is either fixed or past-oriented;
other advantages are expected in the web site build-modify-update process. In
order to modify a classical web site, one has to change the contents of the
pages and the links between them. In our case, the site does not exist as a
given structure: there exist data contained in a database, whose maintenance is
much simpler, and a program (the agent’s program), which is likely to be changed
very rarely, since most of the changes are related to the data and to the
structure by which they are presented, not to the way this structure is built.
Last but not least, this approach allows fast prototyping of sites as well as
validation of how the information is presented.



2 A Case Study 

The application that we will use as a case study deals with the construction
of a virtual computer seller. Computer assembly is a good application domain
because the world of hardware components is rapidly and continuously evolving
so that, on one hand, it would be very expensive to keep a more classical
(static) web site up-to-date; on the other hand, it is unlikely that clients are
equally up-to-date and know the technical characteristics or even just the names
of processors, motherboards, etc. It also allows comparison with the literature,
because this domain has been used in other works, such as [17].

Furthermore, what a computer buyer wants, and often the only thing (s)he knows,
is what (s)he needs the computer for. Sometimes the desired use belongs to
a category (e.g. word processing or internet browsing), sometimes it is more
peculiar and maybe related to a specific job (e.g. use of CAD systems for
design). In a real shop the choice would be made through a dialogue between the
client and the seller (see Table 1 for an example), a dialogue in which the
latter tries to understand which function the client is interested in, proposing
proper configurations. The client, on the other hand, can either accept or
refuse the proposals, maybe specifying further details or constraints. Every new
piece of information will be used for converging to a proposal that the client
will accept.

In the case of on-line purchase, it is reasonable to offer a similar
interaction; this is what our system tries to do. In this application, the
seller and the client have the common goal of building a computer, by joining
their competences, which in the case of the seller is technical whereas in the
case of the client is related both to the reasons for which (s)he is purchasing
that kind of object and to his/her constraints (e.g. the budget). The seller
leads the interaction, by applying a plan (see Figure 1 for an example) aimed at
selling. The goal of such a plan is to find a configuration which satisfies the
goal of the client. Observe that, although the final purpose is always the same,
the variety of the possible situations is so wide that it is not convenient to
build a single, general, and complete plan that can be used in all of the cases;
on the contrary, it is better to build the plan depending on the current
conditions. In this way it is possible to develop the virtual seller in an
incremental way, by augmenting its operational knowledge or by making it more
sophisticated. Once a plan has been defined, the virtual seller follows it for
making the proposals/requests that it considers the most adequate.



3 The Agent Programming Language 

In this section, we will recall the definition of the logic language DyLOG,
referring to the proposal in [4], and its extension to deal with complex actions
and knowledge-producing actions [5, 7]. DyLOG is based on a modal action theory
that has been developed in [4, 14, 15]. It provides a nonmonotonic solution to
the frame problem by making use of abductive assumptions and it deals with the
ramification problem by introducing a “causality” operator. In [5, 7] the action
language has been extended to deal with complex actions and knowledge-producing
actions. The formalization of complex actions draws considerably from
dynamic logic. However, rather than referring to an Algol-like paradigm for
describing complex actions as in GOLOG [16], it refers to a Prolog-like
paradigm: instead of using the iteration operator, complex actions are defined
through (possibly recursive) definitions, given by means of Prolog-like clauses.
The nondeterministic choice among actions is allowed by alternative clause
definitions.

The adoption of Dynamic Logic or a modal logic to deal with the problem of 
reasoning about actions and change is common to many proposals, such as for 
instance [12, 19, 11], and it is motivated by the fact that modal logic allows a very 
natural representation of actions as state transitions, through the accessibility 
relation of Kripke structures. Since the intentional notions (or attitudes), which 
are used to describe agents, are usually represented as modalities, our modal 
action theory is also well suited to incorporate such attitudes. 

In [5, 7] the planning problem has been addressed and a goal directed proof
procedure for reasoning on complex actions and for extracting linear and
conditional plans has been introduced. In [6] it has been described how to use
our implementation of DyLOG as an agent programming language, for executing
procedures which model the behaviour of an agent, but also for reasoning about
them, by extracting from them linear or conditional plans.



3.1 Primitive Actions: The Skills of the Agent

In our action language each primitive action $a \in A$ is represented by a
modality $[a]$. The meaning of the formula $[a]\alpha$ is that $\alpha$ holds
after any execution of action $a$. The meaning of the formula
$\langle a \rangle\alpha$ is that there is a possible execution of action $a$
after which $\alpha$ holds. We also introduce a modality $\Box$, which is used
to denote those formulas that hold in all states, that is, after any action
sequence.

A state consists of a set of fluents, i.e. properties whose truth value may
change over time. In general we cannot assume that the value of each fluent
in a state is known to an agent, and we want to be able to represent the fact
that some fluents are unknown and to reason about the execution of actions on
incomplete states. To represent explicitly the unknown value of some fluents, in
[7] we introduce an epistemic level in our representation language. In
particular, we introduce an epistemic operator $B$, to represent the beliefs an
agent has about the world: $Bf$ will mean that the fluent $f$ is known to be
true, $B\neg f$ will mean that the fluent $f$ is known to be false. Fluent $f$
is undefined in the case both $\neg Bf$ and $\neg B\neg f$ hold. We will write
$u(f)$ for $\neg Bf \wedge \neg B\neg f$. In our implementation of DyLOG (and
also in the following description) we do not explicitly use the epistemic
operator $B$: if a fluent $f$ (or its negation $\neg f$) is present in a state,
it is intended to be believed, unknown otherwise. Thus each fluent can have one
of the three values: true, false or unknown. We use the notation $u(f)?$ to test
if a fluent is unknown (i.e. to test if neither $f$ nor $\neg f$ is present in
the state).
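The three-valued reading of a state can be mimicked in plain Prolog. The sketch
below is only an illustration under our own encoding, not DyLOG’s internal
representation: a state is a list of literals, a negated fluent is written
neg(F), and fluent_value/3 is a hypothetical helper.

```prolog
:- use_module(library(lists)).

% fluent_value(+Fluent, +State, -Value)
% Value is true if Fluent occurs in State, false if neg(Fluent) occurs,
% and unknown if neither literal is present.
fluent_value(F, State, Value) :-
    (   memberchk(F, State)      -> Value = true
    ;   memberchk(neg(F), State) -> Value = false
    ;   Value = unknown
    ).

% unknown(+Fluent, +State): a counterpart of the test u(f)? in this encoding.
unknown(F, State) :-
    fluent_value(F, State, unknown).
```

In the state [has(cpu), neg(has(modem))], for instance, has(cpu) is true,
has(modem) is false, and has(ram) is unknown.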

Simple action laws are rules that allow one to describe direct and indirect
effects of primitive actions on a state. Basically, simple action clauses
consist of action laws, precondition laws, and causal laws:

- Action laws define direct effects of primitive actions on a fluent and
  allow actions with conditional effects to be represented. They have the form
  $\Box(Fs \rightarrow [a]F)$, where $a$ is a primitive action name, $F$ is a
  fluent, and $Fs$ is a fluent conjunction, meaning that action $a$ has effect
  on $F$, when executed in a state where the fluent preconditions $Fs$ hold.

- Precondition laws allow action preconditions, i.e. those conditions which
  make an action executable in a state, to be specified. Precondition laws
  have the form $\Box(Fs \rightarrow \langle a \rangle true)$, meaning that when
  the fluent conjunction $Fs$ holds in a state, execution of the action $a$ is
  possible in that state.

- Causal laws are used to express causal dependencies among fluents and,
  then, to describe indirect effects of primitive actions. They have the form
  $\Box(Fs \rightarrow F)$, meaning that the fluent $F$ holds if the fluent
  conjunction $Fs$ holds too^.

In DyLOG we have adopted a more readable notation: action laws have the
form “a causes F if Fs”, precondition laws have the form “a possible_if Fs”,
and causal rules have the form “F if Fs”.

^ In a logic programming context we represent causality by the directionality of
implication. A more general solution, which makes use of the modality “causes”,
has been provided in [15].

As an example, one of the actions of our selling agent is the following:



add(monitor(X)): add the monitor X to the current configuration

(1) add(monitor(X)) possible_if true.
(2) add(monitor(X)) causes has(monitor(X)).
(3) add(monitor(X)) causes in_the_shopping_cart(monitor(X)) if true.
(4) add(monitor(X)) causes credit(B1) if get_value(X, price, P)
                           ∧ credit(B) ∧ (B1 is B + P).



Rule (1) states that the action add(monitor(X)) is always executable. Action 
laws (2)-(4) describe the effects of the action’s execution: adding the monitor of 
type X causes having the monitor X in the configuration under construction (2), 
having it in the shopping cart (3), and updating the current credit by adding 
the monitor price (4). Analogously, we define the seller’s other primitive actions 
of adding to the configuration a CPU, a RAM, or a peripheral. 

In other cases, an action can be applied only if a given condition is satisfied. 
For instance, a motherboard can be added only if a CPU is already available; 
furthermore, the CPU and the motherboard must be compatible: 



    add(mother(X)) : add the compatible motherboard X to the current 
    configuration 

    (5) add(mother(X)) possible_if has(cpu(C)). 
    (6) add(mother(generic)) causes has(mother(X)) if has(cpu(C)) ∧ 
                                    get_mother_comp(C, X). 



Intuitively, an action can be executed in a state s if the preconditions of the 
action hold in s. The execution of the action modifies the state according to 
the action and causal laws. Furthermore we assume that the value of a fluent 
persists from one state to the next one, if the action does not cause the value of 
the fluent to change. 
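The executability check, the direct effects and the persistence assumption can be illustrated with a small Python sketch. It is not DyLOG and not the authors' interpreter: the dictionary encoding of states, the function `execute`, the price table `PRICES` and the monitor name `m17` are all assumptions made for illustration, and causal laws (indirect effects) are left out.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, object]  # fluent name -> current value (assumed encoding)

# One action law "a causes F if Fs": a guard Fs on the current state and a
# function returning the affected fluent together with its new value.
Law = Tuple[Callable[[State], bool], Callable[[State], Tuple[str, object]]]


def execute(precondition: Callable[[State], bool],
            laws: List[Law], state: State) -> State:
    """Execute one primitive action and return the successor state."""
    if not precondition(state):
        raise ValueError("action is not executable in this state")
    successor = dict(state)          # persistence: every fluent is copied
    for guard, effect in laws:
        if guard(state):             # guards are evaluated in the old state
            fluent, value = effect(state)
            successor[fluent] = value
    return successor


PRICES = {"m17": 200}  # hypothetical price table, stands in for get_value


def add_monitor(x: str):
    """Laws (1)-(4) for add(monitor(X)), in the encoding assumed above."""
    always = lambda s: True
    laws: List[Law] = [
        (always, lambda s: (f"has(monitor({x}))", True)),
        (always, lambda s: (f"in_the_shopping_cart(monitor({x}))", True)),
        (always, lambda s: ("credit", s["credit"] + PRICES[x])),
    ]
    return always, laws


s0 = {"credit": 300, "has(cpu(c1))": True}
precondition, laws = add_monitor("m17")
s1 = execute(precondition, laws, s0)
```

In `s1` the credit has grown by the monitor price, while `has(cpu(c1))` is carried over unchanged because no law mentions it.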



3.2 The Interaction with the User: Sensing and Suggesting Actions 

The interaction of the agent with the user is modeled in our language by means 
of actions for gathering inputs from the external world. 

In [7] we studied how to represent in our framework a particular kind of 
informative actions, called sensing actions, which allow an agent to gather 
knowledge from the environment about the value of a fluent F, rather than to 
change it. In DyLOG direct effects of sensing actions are represented using 
knowledge laws, which have the form “s senses F”, meaning that action s causes 
the agent to know whether F holds^. Generally, these effects are interpreted as 
inputs from outside that are not under the agent’s control but, in a broad sense, 
executing a sensing action allows an agent to interact with the external world 
to determine the value of certain fluents. In this paper we are interested in 
modelling a particular kind of interaction: the interaction of the agent system 
with the user. In this case the user is explicitly requested to enter a value for 
a fluent: true or false in the case of ordinary fluents, a value from the domain 
in the case of fluents with an associated finite domain. The interaction is 
carried out by a sensing action. In our running example, for instance, we 
introduce the binary sensing action ask_if_has_monitor(M), to find out whether 
the user already has a monitor of type M: 

    ask_if_has_monitor(M) possible_if u(user_has(monitor(M))). 
    ask_if_has_monitor(M) senses user_has(monitor(M)). 

^ See [7] for the translation of knowledge laws in the modal language. 

Specifically for the web application domain, we have also defined a special 
subset of sensing actions, called suggesting actions, which are useful when an 
agent has to find out the value of fluents representing the user’s preferences 
among a finite subset of alternatives. The number and values of the alternatives 
depend on the particular interaction that is being carried on. Generally, the agent 
will suggest different subsets of choices to different users. When performing this 
kind of action the agent does not read an input in a passive way, but has an 
active role in selecting (after some reasoning) the possible values among which 
the user chooses. In particular, only those values that lead to fulfilling the goal 
will be selected. These are the intuitive motivations for introducing suggesting 
actions. Formally, the difference w.r.t. normal sensing actions is that while those 
consider as alternative values for a given fluent its whole domain, suggesting 
actions allow the agent to offer only a subset of it. 

For representing the effects of such actions we use the notation “s suggests 
F”, meaning that action s suggests a possibly selected set of values for fluent F 
and causes the agent to know the value of F. As an example, our virtual seller can 
perform a suggesting action to offer to the user the choice among the available 
kinds of monitor: 

    offer_monitor_type possible_if true. 
    offer_monitor_type suggests type_monitor(X). 

The range of X will be computed during the interaction with the user and will 
be a subset of the finite domain associated to type_monitor(X). 
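The difference between a sensing action and a suggesting action can be sketched in Python as a difference in the set of alternatives shown to the user. The domain `MONITOR_PRICES`, the function names and the affordability test are assumptions chosen for illustration; in DyLOG the selection is the result of the agent's reasoning, not of a hand-written filter.

```python
from typing import Callable, Dict, List

# Hypothetical finite domain of type_monitor, with prices.
MONITOR_PRICES: Dict[str, int] = {"crt15": 150, "crt17": 250, "lcd15": 600}


def sensing_alternatives(domain: Dict[str, int]) -> List[str]:
    """A plain sensing action ranges over the whole domain of the fluent."""
    return sorted(domain)


def suggesting_alternatives(domain: Dict[str, int],
                            leads_to_goal: Callable[[str], bool]) -> List[str]:
    """A suggesting action offers only the values from which the goal
    can still be reached."""
    return [value for value in sorted(domain) if leads_to_goal(value)]


credit, budget = 300, 650
affordable = lambda m: credit + MONITOR_PRICES[m] <= budget
offered = suggesting_alternatives(MONITOR_PRICES, affordable)
```

With a credit of 300 and a budget of 650, the most expensive monitor is not offered, although a plain sensing action would still list it.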

3.3 Procedures: The Agent’s Behavior Strategies 

Procedures are used to describe the behaviour of an agent. In particular, for each 
goal driving its behavior, a rational agent has a set of procedures (sometimes 
called plans) which can be seen as strategies for achieving the given goal. 

In our language, procedures define the behavior of complex actions. Complex 
actions are defined on the basis of other complex actions, primitive actions, 
sensing actions and test actions. Test actions are needed for testing if some 
fluent holds in the current state and for expressing conditional complex actions. 
They are written as “(Fs)?”, where Fs is a fluent conjunction. A procedure is 
defined as a collection of procedure clauses of the form 

    $p_0$ is $p_1, \ldots, p_n$    ($n > 0$) 

where $p_0$ is the name of the procedure and $p_i$, i = 1, ..., n, is either a 
primitive action, or a sensing action, or a test action, or a procedure name 
(i.e. a procedure call)^. Procedures can be recursive and can be executed in a 
goal directed way, similarly to standard logic programs. 

From the logical point of view procedure clauses have to be regarded as axiom 
schemas of the logic. More precisely, each procedure clause $p_0$ is 
$p_1, \ldots, p_n$ can be regarded as the axiom schema^ 
$\langle p_1 \rangle \langle p_2 \rangle \cdots \langle p_n \rangle \varphi \supset \langle p_0 \rangle \varphi$. 
Its meaning is that if in a state there is a possible execution of $p_1$, followed 
by an execution of $p_2$, and so on up to $p_n$, then in that state there is a 
possible execution of $p_0$. 

A procedure can contain suggesting actions which are interpreted as inputs 
from the user and, as we will see, allow us to carry on a dialogue with him/her. 



3.4 Planning and Execution 

DyLOG programs are executed by an interpreter which is a straightforward 
implementation of the proof procedure. In general, the execution of an action will 
have an effect on the environment, such as, in our case, showing a web page 
to the user. This can be specified in DyLOG by associating with each primitive 
action some code that implements the effects of the action on the world (e.g. in 
our case the agent asks the actual execution device, a web server, to send a 
given web page to the browser, see Section 4). Therefore, when the interpreter 
executes an action it must commit to it, and it is not allowed to backtrack by 
retracting the effects of the action. Thus procedures are deterministic or at most 
they can implement a kind of “don’t care” non-determinism. 

However, a rational agent must also be able to cope with complex or unex- 
pected situations by reasoning about the effects of a procedure before executing 
it. We can deal with this case by using the language for reasoning about actions 
and, thus, for planning; the agent can do hypothetical reasoning on possible se- 
quences of actions by exploring different alternatives. 

In general, a planning problem amounts to determining, given an initial state, 
whether there is a sequence of actions that, when executed in the initial state, 
leads to a goal state in which Fs holds. In our context, in which complex actions 
can be expressed as procedures, we can consider a specific instance of the planning 
problem in which we want to know if there is a possible execution of a procedure p 
leading to a state in which some condition Fs holds. In such a case the execution 
sequence is not an arbitrary sequence of atomic actions but it is an execution 
of p. In other words, the procedure definitions constrain the space in which the 
desired sequence is sought. This can be formulated by the query 
$\langle p \rangle Fs$, which asks for a terminating execution of p (i.e. a finite 
action sequence) leading to a state in which Fs holds. The execution of the above 
query returns as a side-effect an answer which is an execution trace 
“$a_1, a_2, \ldots, a_m$”; such a trace is a sequence of primitive actions that 
leads from the initial to the final state and it corresponds to a linear plan. 

^ Actually in DyLOG $p_i$ can also be a Prolog goal. 

^ These axioms have the form of rewriting rules as in grammar logics. In [3] 
decidability results for subclasses of grammar logics are provided, and right 
regular grammar logics are proved to be decidable. 

Fig. 1. The result of the planning process when the user does not have any 
component and has a budget of 650. 

To achieve this, the DyLOG implementation provides a metapredicate plan(p, 
Fs, as), where p is a procedure, Fs a condition on the goal state and as a 
sequence of primitive actions. The procedure p can be nondeterministic, and 
plan will extract from it a sequence as of primitive actions, a plan, corresponding 
to a possible execution of the procedure, leading to a state in which Fs holds, 
starting from the current state. Procedure plan works by executing p in the 
same way as the interpreter of the language, with one main difference: primitive 
actions are executed without any effect on the external environment, and, as a 
consequence, they are backtrackable. 
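The idea behind plan(p, Fs, as) can be sketched in Python as a depth-first enumeration of the executions of p on copies of the state, so that abandoning a branch has no external effect. This is only an illustration under assumed encodings (procedures as lists of clause bodies, tests as `("?", condition)` pairs, the helper `add` and all action names are hypothetical); it is not the DyLOG proof procedure, and a recursive procedure may make this naive search loop.

```python
from typing import Callable, Dict, Iterator, List, Optional, Tuple, Union

State = Dict[str, object]
Test = Tuple[str, Callable[[State], bool]]         # ("?", condition)
Step = Union[str, Test]                             # name of an action/procedure, or a test
Primitive = Tuple[Callable[[State], bool], Callable[[State], State]]


def executions(body: List[Step], state: State, trace: List[str],
               primitives: Dict[str, Primitive],
               procedures: Dict[str, List[List[Step]]]
               ) -> Iterator[Tuple[State, List[str]]]:
    """Enumerate, depth first, the executions of a clause body."""
    if not body:
        yield state, trace
        return
    first, rest = body[0], body[1:]
    if isinstance(first, tuple):                    # test action (Fs)?
        if first[1](state):
            yield from executions(rest, state, trace, primitives, procedures)
    elif first in primitives:                       # primitive action
        possible, effect = primitives[first]
        if possible(state):
            yield from executions(rest, effect(state), trace + [first],
                                  primitives, procedures)
    else:                                           # procedure call
        for clause in procedures[first]:
            yield from executions(clause + rest, state, trace,
                                  primitives, procedures)


def plan(p: str, goal: Callable[[State], bool], state: State,
         primitives: Dict[str, Primitive],
         procedures: Dict[str, List[List[Step]]]) -> Optional[List[str]]:
    """Return the first execution trace of p that reaches a goal state."""
    for final, trace in executions([p], state, [], primitives, procedures):
        if goal(final):
            return trace
    return None


def add(component: str, price: int) -> Primitive:
    """Hypothetical primitive: always possible, adds the price to credit."""
    def effect(state: State) -> State:
        return {**state, "has_" + component: True,
                "credit": state["credit"] + price}
    return (lambda s: True), effect


primitives = {"add_cpu_fast": add("cpu", 500),
              "add_cpu_cheap": add("cpu", 200),
              "add_ram": add("ram", 100)}
procedures = {"assemble": [["choose_cpu", "add_ram"]],
              "choose_cpu": [["add_cpu_fast"], ["add_cpu_cheap"]]}
within_budget = lambda s: s["credit"] <= s["budget"]
result = plan("assemble", within_budget, {"credit": 0, "budget": 400},
              primitives, procedures)
```

The first clause of `choose_cpu` exceeds the budget, so the search backtracks and the returned trace uses the cheaper CPU.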

Since procedures can contain sensing (or suggesting) actions, whose outcomes 
are unknown at planning time, all the possible alternatives are to be taken into 
account. Therefore, by applying the DyLOG planning predicate to a procedure that 
contains sensing actions we obtain a conditional plan whose branches correspond 
to the possible outcomes of sensing (or suggesting). 
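A conditional plan can be pictured as a tree that branches at each sensing or suggesting action. The Python sketch below builds such a tree for a fixed sequence of steps; the step encoding, the prices and the function name are assumptions made for illustration. It follows the suggesting-action reading, in which only the outcomes that still lead to the goal are kept as branches; a plain sensing action, whose outcome the agent cannot restrict, would instead require every branch to succeed.

```python
from typing import Callable, Dict, List, Optional

State = Dict[str, object]


def conditional_plan(steps: List[tuple], state: State,
                     goal: Callable[[State], bool]) -> Optional[list]:
    """Build a tree-shaped plan for a fixed sequence of steps.

    A step is ("do", name, effect) for a primitive action or
    ("suggest", name, fluent, domain) for a suggesting action.
    """
    if not steps:
        return [] if goal(state) else None
    head, rest = steps[0], steps[1:]
    if head[0] == "do":
        _, name, effect = head
        tail = conditional_plan(rest, effect(state), goal)
        return None if tail is None else [name] + tail
    _, name, fluent, domain = head
    branches = {}
    for value in domain:                       # one branch per outcome
        tail = conditional_plan(rest, {**state, fluent: value}, goal)
        if tail is not None:                   # keep only successful branches
            branches[value] = tail
    return [(name, branches)] if branches else None


PRICES = {"crt15": 150, "lcd15": 600}          # hypothetical prices
steps = [
    ("suggest", "offer_monitor_type", "type_monitor", ["crt15", "lcd15"]),
    ("do", "add_monitor",
     lambda s: {**s, "credit": s["credit"] + PRICES[s["type_monitor"]]}),
]
tree = conditional_plan(steps, {"credit": 400, "budget": 650},
                        lambda s: s["credit"] <= s["budget"])
```

Here the resulting tree has a single branch, because choosing the more expensive monitor would exceed the budget.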
4 The WLog System 

In this section we describe the agent system that we developed and applied to 
the computer selling case: WLog. 

4.1 Architecture 

The architecture is sketched in Figure 2. In the current implementation, the 
system consists of two kinds of agents: reasoners and executors. Reasoners are 
programs written in DyLOG whereas executors are Java servlets embedded in 
an Apache web server. The connection between the two kinds of agents has the 
form of message exchange. Technical information about the system, the Java 
classes that we defined, and the DyLOG programs (as well as our virtual seller 
example) can be found at http://www.di.unito.it/~alice. 

[Figure 2: diagram; recoverable labels: Browser, Java Virtual Machine, Servlet, 
DyLOG Reasoner Agent] 

Fig. 2. A sketch of WLog architecture. 



As explained in the previous sections, the interaction between the user and 
WLog starts with the declaration of the user’s goal, that is, in our case study 
(see Section 2), what (s)he needs a computer for. The goal can either belong to 
a set of alternatives that was defined a priori (such as “multimedia”) or it can 
be a DyLOG query. The web server (Apache) dispatches the requests 
to a free reasoner (if any is available), which from that moment till the end of 
the interaction will be dedicated to serving that specific client. The dispatch and 
the interaction are done by means of the Java servlets, which work as an interface. 
The connection with the reasoners can currently be made either by means of sockets 
or by means of Remote Method Invocation but it can easily be extended with 
other kinds of communication systems. 

Once the goal is passed on to the reasoner, the reasoner produces a conditional 
plan aimed at reaching that goal. In our example, it will produce a plan 
for assembling a computer whose main use will be “multimedia”. Each path in 
the plan will correspond to a different computer configuration; however, all 
configurations in the conditional plan satisfy the user’s intentions (see, for 
instance, Figure 1). The conditional plan is the web site that can virtually be 
navigated by the user. 

Once built, the plan is executed. Executing a conditional plan implies following 
one of the paths; only the part of the site corresponding to this path will 
actually be built. The execution of some of the actions in the path consists in 
showing the user one or more web pages; in particular, those actions (sensing 
and suggesting) that correspond to a branching point require feedback from 
the user. Other actions only affect the reasoner’s mental state. 

The actual execution (i.e. producing and showing web pages to the user) 
is a task of the executor (see Figure 2). The most general case is the one in 
which a set of alternatives for a given component is available. In our application 
domain the choice of which to buy is up to the user, so (s)he will be shown an 
HTML page containing the possible alternatives. Thus the interaction between 
the reasoner and the executor is as follows: the reasoner sends to the executor 
a command of the kind “offer CPU1, CPU2” and the executor produces some 
HTML code that contains the information related to the two CPUs identified by 
CPU1 and CPU2 plus the request to make a choice. Once an answer is returned 
from the client, this is passed on to the reasoner, which performs an action that 
is transparent to the client: it adds the new fact to the knowledge base and takes 
it into account for passing to the next step, that is, sending to the executor 
the information about which page to show next. All unselected alternatives are 
forgotten. 
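The exchange triggered by a command such as “offer CPU1, CPU2” can be sketched as follows. The actual executors are Java servlets; this Python fragment only illustrates the two halves of the step, and the function names, the form layout and the dictionary used as knowledge base are assumptions, not part of WLog.

```python
from html import escape
from typing import Dict


def offer_page(fluent: str, alternatives: Dict[str, str]) -> str:
    """Executor side: render the offered alternatives as an HTML form."""
    rows = [
        f'<label><input type="radio" name="{escape(fluent)}" '
        f'value="{escape(ident)}"> {escape(description)}</label><br>'
        for ident, description in alternatives.items()
    ]
    return "\n".join(['<form method="post">', *rows,
                      '<input type="submit" value="Choose">', "</form>"])


def record_choice(knowledge: Dict[str, str], fluent: str,
                  form_data: Dict[str, str],
                  alternatives: Dict[str, str]) -> Dict[str, str]:
    """Reasoner side: add the selected value to the knowledge base."""
    choice = form_data[fluent]
    if choice not in alternatives:
        raise ValueError(f"{choice!r} was not among the offered alternatives")
    return {**knowledge, fluent: choice}


cpus = {"CPU1": "CPU 1 <fast>", "CPU2": "CPU 2"}
page = offer_page("cpu", cpus)
knowledge = record_choice({}, "cpu", {"cpu": "CPU2"}, cpus)
```

Only the chosen alternative ends up in `knowledge`; the unselected one leaves no trace, in line with the behaviour described above.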



4.2 The Virtual Seller: An Example of Agent Program 

In this section we briefly report an example of a reasoner that we used in our 
case study. The behaviour of our selling agent is described by giving a collection 
of procedures. The top level procedure, build_a_computer, produces a plan and 
follows it. 

    (1) build_a_computer is get_user_preferences; get_max_value_budget; 
            plan(assemble, credit(C) ∧ budget(B) ∧ (C < B), P); P. 

This is done by first interacting with the user in order to find (and adopt) 
his/her goals, by asking what kind of computer the user is interested in, by 
checking if the user has some of the needed components (get_user_preferences), 
and by getting information about budget limitations (get_max_value_budget). 
Second, it plans how to reach the goals, predicting also future interactions 
with the user. Planning is needed to find configurations by taking into account 
two interacting goals: the goal to assemble a computer satisfying the 
user’s needs and the goal to consider only configurations affordable within the 
user’s budget. Observe that the metapredicate plan in (1) corresponds to the query 
$\langle assemble \rangle (credit(C) \wedge budget(B) \wedge (C < B))$ in the modal 
logical language. Finally, the agent executes the conditional plan P resulting from 
the planning process. 

The way the agent assembles a computer is specified by the procedure assemble 
which, until the computer is believed to be assembled, tries to achieve the goal 
of getting a still missing component. 

    assemble is assembled?. 
    assemble is ¬assembled?; achieve_goal; assemble. 

Note that only when all of the goals to get the necessary components are 
fulfilled is the main goal of having a computer to propose to the user reached, 
and the computer is considered assembled. As long as there is still a goal to 
fulfill, the computer is considered not assembled (this is expressed by the causal 
rule: ¬assembled if goal(X)). 

We assume the behaviour of a rational agent to be driven by a set of goals, 
which are represented as fluents having the form goal(F). The system detects the 
goals based on the user’s inputs and its expert competence about computer 
configurations. Initially the reasoner does not have explicit goals, because no 
interaction with the user has been performed. The user’s inputs are obtained after 
a first interaction phase (see (1)) and they generate a set of goals that the agent 
has to achieve to assemble the requested computer. In the language, we model this 
by means of causal rules, by describing the adoption of a goal as the indirect 
effect of requesting the user’s preferences. For instance, in get_user_preferences 
the first suggesting action, offer_computer_type, asks what kind of computer the 
user needs. This action has as an indirect effect the generation of the goal to 
have a computer having those characteristics: 

    offer_computer_type suggests requested(X). 
    goal(has(X)) if requested(X). 

Let us suppose the agent has been requested to assemble a computer for 
multimedia (the fluent requested(computer(multimedia)) is in the state); then 
the causal rule above will generate the goal goal(has(computer(multimedia))). 
This main goal will generate a set of sub-goals to get the needed components to 
build the requested computer by means of the appropriate instantiation of the 
following causal rule: 

    goal(has(C)) if goal(has(computer(X))) ∧ component(computer(X), C). 

After adopting a goal goal(F), an agent acts so as to achieve it until it believes 
the goal is fulfilled (i.e. until it reaches a state where F holds). This corresponds 
to adopting a blind commitment strategy. 
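The way the two causal rules generate goals from a request can be sketched as a fixpoint computation. The tuple encoding of fluents, the component table `COMPONENTS` and both function names are assumptions made for this Python illustration; DyLOG derives the same consequences through its causal laws rather than through an explicit loop.

```python
from typing import Dict, List, Set, Tuple

Fluent = Tuple[str, str]      # ("requested", X) or ("goal_has", X)

# Hypothetical component table, standing in for component(computer(X), C).
COMPONENTS: Dict[str, List[str]] = {
    "computer(multimedia)": ["monitor(generic)", "cpu(generic)", "ram(generic)"],
}


def close_under_causal_rules(state: Set[Fluent],
                             components: Dict[str, List[str]]) -> Set[Fluent]:
    """Add every goal that follows from the two causal rules."""
    closed = set(state)
    while True:
        derived = set()
        for kind, item in closed:
            if kind == "requested":        # goal(has(X)) if requested(X)
                derived.add(("goal_has", item))
            elif kind == "goal_has":       # sub-goals for the components of X
                for part in components.get(item, []):
                    derived.add(("goal_has", part))
        if derived <= closed:              # nothing new: fixpoint reached
            return closed
        closed |= derived


def assembled(state: Set[Fluent]) -> bool:
    """Mirrors "not assembled if goal(X)": assembled only if no goal is left."""
    return not any(kind == "goal_has" for kind, _ in state)


goals = close_under_causal_rules({("requested", "computer(multimedia)")},
                                 COMPONENTS)
```

Starting from the single request, the closure contains the main goal and one sub-goal per component, so `assembled(goals)` is false until those goals are dealt with.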

We can now get into the details of procedure achieve_goal, which allows 
the agent to select in a non-deterministic way the goal of adding a component 
(monitor, CPU, RAM or peripheral) to the specific computer that is being built. 
When the agent has the goal to get a generic component, it has to choose among 
the available types, so it interacts again with the user to decide what specific 
component to add according to the user’s preferences: 

    achieve_goal is goal(has(monitor(generic)))?; offer_monitor_type; 
                    type_monitor(X)?; add(monitor(X)). 
    achieve_goal is goal(has(monitor(X)))?; (X ≠ generic)?; add(monitor(X)). 
    achieve_goal is goal(has(ram(generic)))?; offer_ram_type; 
                    type_ram(X)?; add(ram(X)). 
    achieve_goal is goal(has(ram(X)))?; (X ≠ generic)?; add(ram(X)). 



Note that the above formulation of the behaviour of the agent has many 
similarities with agent programming languages based on the BDI paradigm such 
as dMARS [13]. As in dMARS, plans are triggered by goals and are expressed 
as sequences of primitive actions, tests or goals. 



5 Conclusions and Future Work 

In this paper we have presented a new perspective on interface adaptation by 
tackling the problem of the construction of adaptive web sites based on the user’s 
intentions. This approach is orthogonal to the classical approach of focusing on 
the user model and it is our opinion that a real adaptive system should encompass 
both these aspects. We have shown how logic programming languages (and, in 
particular, DyLOG) can be used for this kind of application, that is to define 
the behavior of an agent that builds the web site on demand, according to the 
needs of each of its clients. We think that our approach to adaptation could have 
other interesting applications in the construction of automatic guides for virtual 
museums and help-on-line systems. 

The work that we have presented is in progress. We are currently extending 
WLog by introducing a new agent, whose task is to interact with a Data Base 
that contains all the factual information about the domain, and that serves as an 
interface between such a Data Base and both the reasoners and the executors. 
We are also extending the logical framework in order to model the communication 
among the agents in the system in DyLOG itself. Indeed, a declarative 
specification of the communication would allow one to prove correctness properties 
of the interaction among the agents. One last extension that we mean to study 
is to tackle the different kinds of failure that can occur during the interaction 
between a user and the system; in particular, the possibility that the user 
refuses the system’s proposals. In these cases replanning should occur. 


Abstract. We propose a new approach to programming web services. 
Although we base our approach on the Common Gateway Interface (CGI) 
to ensure wide applicability, we avoid many of the drawbacks and pitfalls 
of traditional CGI programming by providing an additional abstraction 
layer implemented in the multi-paradigm declarative language Curry. 
For instance, the syntactical details of HTML and passing values with 
CGI are hidden by a wrapper that executes abstract HTML forms by 
translating them into concrete HTML code. This leads to a high-level ap- 
proach to server side web service programming where notions like event 
handlers, state variables and control of interactions are available. Thanks 
to the use of a functional logic language, we can structure our approach 
as an embedded domain specific language where the functional and logic 
programming features of the host language are exploited to abstract from 
details and frequent errors in standard CGI programming. 



1 Motivation 

In the early days of the World Wide Web (in the following called the web), 
most of the documents were static, i.e., stored in files which can be viewed in a 
nicely formatted layout. With the introduction of the Common Gateway Inter- 
face (CGI), more and more documents become dynamic, i.e., they are computed 
on the web server at the time they are requested from a client. In combination 
with input forms specified in HTML documents, more complex forms of interac- 
tions become possible so that clients can retrieve or store specific data via their 
web browsers. 

An advantage of CGI is that it is supported by most web servers. Thus, the 
use of CGI does not need any special extensions on the server or the client side 
(e.g., no servlets or cookies), which is a requirement for our development in order 
to ensure wide applicability. On the other hand, CGI offers only a very primi- 
tive form of interaction so that the programming of web services often becomes 
awkward. Although general scripting languages like Perl provide libraries for 
decoding input form data, they do not support the programmer in constructing 
correct output data or in controlling a sequence of interactions with the client. 

* This research has been partially supported by the German Research Council (DFG) 
under grant Ha 2457/1-2 and by the DAAD under the PROCOPE programme. 

This calls for specialized languages (e.g., MAWL [12], DynDoc [15]) or 
specialized libraries in existing languages (e.g., [2,13,17]). In this paper we take 
the latter approach. We show how the features of a functional logic language (see 
[3] for a survey on this kind of languages) can be exploited to provide a flexible 
and high-level approach to programming web services without any language 
extensions (since our library is completely implemented in Curry). In particular, 
our approach offers the following features for implementing web services: 

— The HTML documents requested by the clients can be flexibly generated 
depending on the computed data. 

— The data filled in a form by the user can be easily retrieved by an environ- 
ment model using logical variables as references. 

— The use of logical variables as references (instead of fixed strings as in “raw” 
CGI) improves the compositionality of HTML forms. 

— The different actions to be taken when a user has completed a form are 
specified by an event handler model. 

— The sequence (or iterations) of interactions with the web server is described 
in one script and not distributed over a set of script files. In particular, a 
form is described together with the handler for this form which avoids typical 
CGI programming errors (e.g., undefined input fields). 

— State variables which should persist between different interactions are di- 
rectly supported. 

— The CGI interaction (usually, by environment variables and value decoding) 
is hidden to the user and encapsulated in a wrapper that translates the 
high-level scripts into HTML code. 

This paper is structured as follows. The next section provides a short overview of 
the main features of Curry as relevant for this paper. Sections 3 and 4 introduce 
our approach for modeling basic HTML documents and interactive forms. Sec- 
tion 5 discusses the use of our programming model by various examples before 
we sketch in Sect. 6 the implementation of our library and conclude in Sect. 7 
with a discussion of related work. 



2 Basic Elements of Curry 

Since we assume familiarity with basic HTML and CGI programming, we review 
in this section only those elements of Curry which are necessary to understand 
the ideas presented in this paper. More details about Curry’s computation model 
and a complete description of all language features can be found in [4,9]. 

Curry is a modern multi-paradigm declarative language combining in a seamless 
way features from functional programming (nested expressions, lazy evaluation, 
higher-order functions), logic programming (logical variables, partial data 
structures, built-in search), and concurrent programming (concurrent evaluation 
of expressions with synchronization on logical variables), and supports 
programming-in-the-large with specific features (types, modules, encapsulated 
search). From a syntactic point of view, a Curry program is a functional 
program^ extended by the possible inclusion of free (logical) variables in 
conditions and right-hand sides of defining rules. Thus, a Curry program consists 
of the definition of functions and the data types on which the functions operate. 
Functions are evaluated in a lazy manner. To provide the full power of logic 
programming, functions can be called with partially instantiated arguments and 
defined by conditional equations with constraints in the conditions. The behavior 
of function calls with free variables depends on the evaluation annotations of 
functions which can be either flexible or rigid. Calls to rigid functions are 
suspended if a demanded argument, i.e., an argument whose value is necessary to 
decide the applicability of a rule, is uninstantiated (“residuation”). Calls to 
flexible functions are evaluated by a possibly non-deterministic instantiation of 
the demanded arguments to the required values in order to apply a rule 
(“narrowing”). 

Example 1. The following Curry program defines the data types of Boolean values 
and polymorphic lists (first two lines) and functions for computing the 
concatenation of lists and the last element of a list: 

    data Bool = True | False 
    data List a = [] | a : List a 

    conc :: [a] -> [a] -> [a] 
    conc eval flex 
    conc []     ys = ys 
    conc (x:xs) ys = x : conc xs ys 

    last xs | conc ys [x] =:= xs = x   where x,ys free 

The data type declarations define True and False as the Boolean constants and 
[] (empty list) and : (non-empty list) as the constructors for polymorphic lists 
(a is a type variable ranging over all types and the type “List a” is usually 
written as [a] for conformity with Haskell). 

The (optional) type declaration (“::”) of the function conc specifies that 
conc takes two lists as input and produces an output list, where all list elements 
are of the same (unspecified) type.^ Since conc is explicitly defined as flexible^ 
(by “eval flex”), the equation “conc ys [x] =:= xs” can be solved by 
instantiating the first argument ys to the list xs without the last element, i.e., 
the only solution to this equation satisfies that x is the last element of xs. 
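For readers unfamiliar with narrowing, the way `last` is computed can be imitated in a language without logical variables by enumerating candidate bindings explicitly. The Python sketch below is only an analogy under that assumption: the generator `conc_solutions` (a name introduced here) lists every way of splitting `xs` into two parts, which is the set of bindings a flexible `conc` could produce, and `last` picks the split whose second part is a one-element list.

```python
from typing import Iterator, List, Tuple, TypeVar

T = TypeVar("T")


def conc_solutions(xs: List[T]) -> Iterator[Tuple[List[T], List[T]]]:
    """Enumerate all pairs (ys, zs) with ys + zs == xs."""
    for i in range(len(xs) + 1):
        yield xs[:i], xs[i:]


def last(xs: List[T]) -> T:
    """Solve "conc ys [x] =:= xs" for x by searching the splits of xs."""
    for ys, zs in conc_solutions(xs):
        if len(zs) == 1:          # zs has the shape [x]
            return zs[0]
    raise ValueError("last is undefined for the empty list")


x = last([1, 2, 3])
```

In Curry no such explicit search is written: the bindings for `ys` and `x` are found by the evaluation strategy itself.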

In general, functions are defined by (conditional) rules of the form 
“l | c = e where vs free” where l has the form $f\ t_1 \ldots t_n$ with f being a 
function, $t_1, \ldots, t_n$ data terms and each variable occurs only once, the 
condition c is a constraint, e is a well-formed expression which may also contain 
function calls, lambda abstractions etc., and vs is the list of free variables that 
occur in c and e but not in l (the condition and the where parts can be omitted if 
c and vs are empty, respectively). The where part can also contain further local 
function definitions which are only visible in this rule. A conditional rule can be 
applied if its left-hand side matches the current call and its condition is 
satisfiable. A constraint is any expression of the built-in type Success. Each 
Curry system provides at least equational constraints of the form $e_1$ =:= $e_2$ 
which are satisfiable if both sides $e_1$ and $e_2$ are reducible to unifiable data 
terms (i.e., terms without defined function symbols). However, specific Curry 
systems can also support more powerful constraint structures, like arithmetic 
constraints on real numbers or finite domain constraints, as in the PAKCS 
implementation [7]. 

^ Curry has a Haskell-like syntax [14], i.e., (type) variables and function names 
usually start with lowercase letters and the names of type and data constructors 
start with an uppercase letter. The application of f to e is denoted by 
juxtaposition (“f e”). 

^ Curry uses curried function types where α->β denotes the type of all functions 
mapping elements of type α into elements of type β. 

^ As a default, all functions except for constraints are rigid. 

The operational semantics of Curry, precisely described in [4,9], is a conser- 
vative extension of lazy functional programming (if no free variables occur in the 
program or the initial goal) and (concurrent) logic programming. Due to the use 
of an optimal evaluation strategy [1], Curry can be considered as a generaliza- 
tion of concurrent constraint programming [16] with a lazy (optimal) evaluation 
strategy. Due to this generalization, Curry supports a clear separation between 
the sequential (functional) parts of a program, which are evaluated with an ef- 
ficient and optimal evaluation strategy, and the concurrent parts, based on the 
concurrent evaluation of constraints, to coordinate concurrent program units. 

Monadic I/O: Since web service programs usually interact with their environment 
(e.g., retrieve or store information in files on the server), some knowledge 
about performing I/O in a declarative manner is required. The I/O concept of 
Curry is identical to the monadic I/O concept of Haskell [18], i.e., an interactive 
program computes a sequence of actions which are applied to the outside 
world. Actions have type “IO α” which means that they return a result of type 
α whenever they are applied to (and change) the outside world. For instance, 
getChar of type IO Char is an action which reads a character from the standard 
input whenever it is executed, i.e., applied to a world. Similarly, “readFile f” 
is an action which returns the contents of file f in the current world. Actions 
can only be sequentially composed. For instance, the action getChar can be 
composed with the action putChar (which has type Char -> IO () and writes 
a character to the terminal) by the sequential composition operator >>= (which 
has type IO α -> (α -> IO β) -> IO β), i.e., “getChar >>= putChar” is a 
composed action which prints the next character of the input stream on the 
screen. Finally, “return e” is the “empty” action which simply returns e (see 
[18] for more details). 
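The sequential composition of actions can be imitated in Python by treating an action as a function that is applied to a world. This is a rough model under stated assumptions, not Curry's or Haskell's I/O implementation: the `World` record, `bind` (standing for >>=) and `unit` (standing for return) are names introduced here, and the world is a pair of in-memory streams.

```python
import io
from typing import Any, Callable, NamedTuple, TextIO


class World(NamedTuple):
    """Assumed stand-in for the outside world: an input and an output stream."""
    stdin: TextIO
    stdout: TextIO


Action = Callable[[World], Any]


def get_char(world: World) -> str:
    """Counterpart of getChar: read one character from the input."""
    return world.stdin.read(1)


def put_char(c: str) -> Action:
    """Counterpart of putChar: an action that writes c to the output."""
    def action(world: World) -> None:
        world.stdout.write(c)
    return action


def bind(action: Action, f: Callable[[Any], Action]) -> Action:
    """Counterpart of >>=: run action, pass its result to f, run the result."""
    def composed(world: World) -> Any:
        return f(action(world))(world)
    return composed


def unit(e: Any) -> Action:
    """Counterpart of return: an action that only yields e."""
    return lambda world: e


echo = bind(get_char, put_char)       # getChar >>= putChar
world = World(io.StringIO("hello"), io.StringIO())
echo(world)
```

Nothing is read or written when `echo` is built; the effect happens only when the composed action is applied to `world`.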

3 Modeling Basic HTML 

In order to avoid certain syntactical errors (e.g., unbalanced parentheses) during 
the generation of HTML documents by a web server, the programmer should not 
be forced to generate the explicit text of HTML documents (as in CGI scripts 
written in Perl or with the Unix shell). A better approach is the introduction
of an abstraction layer where HTML documents are modeled as terms of a 
specific data type together with a wrapper function which is responsible for the 
correct textual representation of this data type. Such an approach can be easily 
implemented in a language supporting algebraic data types (e.g., [13]). Thus, we 
introduce the type of HTML expressions in Curry as follows: 

data HtmlExp = HtmlText String
             | HtmlStruct String [(String,String)] [HtmlExp]
             | HtmlElem String [(String,String)]



Thus, an HTML expression is either a plain string or a structure consisting of 
a tag (e.g., B,EM,H1,H2,. . . ), a list of attributes, and a list of HTML expressions 
contained in this structure. The translation of such HTML expressions into their 
corresponding textual representation is straightforward: an HtmlText is represented
by its argument, and a structure with tag t is enclosed in the brackets
<t> and </t> (where the attributes, if any, are added to the opening bracket).
Since there are a few HTML elements without a closing tag (like <HR> or <BR>),
we have included the alternative HtmlElem to represent these elements.

Since writing HTML documents in this form might be tedious, we define 
several functions as useful abbreviations (htmlQuote transforms characters with 
a special meaning in HTML, like <, >, &, ", into their HTML quoted form): 



htxt s       = HtmlText (htmlQuote s)    -- plain string
h1 hexps     = HtmlStruct "H1" [] hexps  -- main header
bold hexps   = HtmlStruct "B" [] hexps   -- bold font
italic hexps = HtmlStruct "I" [] hexps   -- italic font
hrule        = HtmlElem "HR" []          -- horizontal rule



As a simple example, the following expression defines a “Hello World” document 
consisting of a header and two words in italic and bold font, respectively: 

[h1 [htxt "Hello World"],
 italic [htxt "Hello"], bold [htxt "world!"]]
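
The translation of such terms into text can be sketched as follows. This is a Python analogy of the data type and of a wrapper in the spirit described above, not the code of the Curry library: the function name show_html and the use of html.escape in place of htmlQuote are assumptions of this sketch.

```python
from dataclasses import dataclass
from html import escape

@dataclass
class HtmlText:
    text: str                 # already quoted text

@dataclass
class HtmlStruct:
    tag: str
    attrs: list               # list of (name, value) pairs
    children: list            # nested HTML expressions

@dataclass
class HtmlElem:
    tag: str                  # element without a closing tag
    attrs: list

def htxt(s): return HtmlText(escape(s, quote=True))
def h1(hexps): return HtmlStruct("H1", [], hexps)
def bold(hexps): return HtmlStruct("B", [], hexps)
def italic(hexps): return HtmlStruct("I", [], hexps)
def hrule(): return HtmlElem("HR", [])

def show_attrs(attrs):
    return "".join(f' {n}="{escape(v, quote=True)}"' for n, v in attrs)

def show_html(hexp):
    """Translate one HTML expression into its textual representation."""
    if isinstance(hexp, HtmlText):
        return hexp.text
    if isinstance(hexp, HtmlElem):
        return f"<{hexp.tag}{show_attrs(hexp.attrs)}>"
    if isinstance(hexp, HtmlStruct):
        inner = "".join(show_html(c) for c in hexp.children)
        return f"<{hexp.tag}{show_attrs(hexp.attrs)}>{inner}</{hexp.tag}>"
    raise TypeError(f"not an HTML expression: {hexp!r}")

doc = [h1([htxt("Hello World")]), italic([htxt("Hello")]), bold([htxt("world!")])]
page = "".join(show_html(e) for e in doc)
```

Because every opening tag is produced together with its closing tag in one place, a document built from these constructors cannot contain unbalanced tags.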



4 Input Forms 

In order to enable more sophisticated interactions between clients using standard 
browsers and a web server, HTML defines so-called FORM elements which usually
contain several input elements to be filled out by the client. When the client
submits such a form, the data contained in the input elements is encoded and 
sent (on the standard input or with the URL) to the server which starts a CGI 
program to react to the submission. The activated program decodes the input 
data and performs some application-dependent processing before it returns an 
HTML document on the standard output which is then sent back to the client. 

In principle, the type HtmlExp is sufficient to model all kinds of HTML doc- 
uments including input elements like text fields, check buttons etc. For instance, 
an input field to be filled out with a text string can be modeled as 

HtmlElem "INPUT" [("TYPE","TEXT"),("NAME",name),("VALUE",cont)]

where the string cont defines the initial contents of this field and the string
name is used to identify this field when the data of the filled form is sent to
the server. This direct approach is taken in CGI libraries for scripting languages 
like Perl or also in the CGI library for Haskell [13]. In this case, the program 
running on the web server is an I/O action that decodes the input data (con- 
tained in environment variables and the standard input stream) and puts the 
resulting HTML document on the output stream. Therefore, CGI programs can 
be implemented in any programming language supporting access to the system 
environment. However, this basic view results in an awkward programming style 
when sequences of interactions (i.e., HTML forms) must be modeled where state 
should be passed between different interactions. Therefore, we propose a higher 
abstraction level and we will show that the functional and logic features of Curry 
can be exploited to provide an appropriate programming infrastructure. There 
are two basic ideas of our programming model: 



1. The input fields are not referenced by strings but by elements of a specific 
abstract data type. This has the advantage that the names of references 
correspond to names of program variables so that the compiler can check 
inconsistencies in the naming of references. 

2. The program that is activated when a form is submitted is implemented to- 
gether with the program generating the form. This has the advantage that 
sequences of interactions can be simply implemented using the control ab- 
stractions of the underlying language and state can be easily passed between 
different interactions of a sequence using the references mentioned above. 

For dealing with references to input fields, we use logical variables since it is well 
known that logical variables are a useful notion to express dependencies inside 
data structures [6,19]. To be more precise, we introduce a data type 

data CgiRef = CgiRef String 

denoting the type of all references to input elements in HTML forms. This data 
type is abstract, i.e., its constructor CgiRef is not exported by our library. This 
is essential since it avoids the construction of wrong references. The only way to
introduce such references is by logical variables, and the global wrapper function
is responsible for instantiating these variables with appropriate references (i.e.,
instantiating each reference variable to a term of the form CgiRef n where n is a
unique name).

To include references in HTML forms, we extend the definition of our data 
type for HTML expressions by the following alternative: 

data HtmlExp = ... | HtmlCRef HtmlExp CgiRef

A term “HtmlCRef hexp cr” denotes an HTML element hexp with a reference 
to it. Usually, hexp is one of the input elements defined for HTML, like text 
fields, text areas, check boxes etc. For instance, a text field is defined by the
following abbreviation in our library:^4

textfield :: CgiRef -> String -> HtmlExp
textfield eval flex
textfield (CgiRef ref) contents =
  HtmlCRef (HtmlElem "INPUT" [("TYPE","TEXT"),("NAME",ref),
                              ("VALUE",contents)])
           (CgiRef ref)

Note that ref is unbound when this function is applied but it will be bound to 
a unique name (string) by the wrapper function executing the form (see below).

A complete HTML form consists of a title and a list of HTML expressions to 
be displayed by the client’s browser, i.e., we represent HTML forms as expres- 
sions of the following data type: 

data HtmlForm = Form String [HtmlExp] 

Thus, we can define a form containing a single input element (a text field) by 

Form "Form" [h1 [htxt "A Simple Form"],
             htxt "Enter a string:", textfield sref ""]

In order to submit a form to the web server, HTML supports “submit” buttons 
(we only discuss this submission method here although there are others). The 
actions to be taken are described by CGI programs that decode the submitted 
values of the form before they perform the appropriate actions. To simplify these 
actions and combine them with the program generating the form, we propose 
an event handling model for CGI programming. For this purpose, each submit 
button is associated with an event handler responsible for performing the appropriate
actions. An event handler is a function from a CGI environment into an I/O 
action (in order to enable access to the server environment) that returns a new 
form to be sent back to the client. A CGI environment is simply a mapping from 
CGI references into strings. When an event handler is executed, it is supplied 
with a CGI environment containing the values entered by the client into the 
form. Thus, event handlers have the type 

type EventHandler = (CgiRef -> String) -> IO HtmlForm

To attach an event handler to an HTML element, we finally extend the definition 
of our data type for HTML expressions by: 

data HtmlExp = ... | HtmlEvent HtmlExp EventHandler

A term “HtmlEvent hexp handler” denotes an HTML element hexp (typically 
a submit button) with an associated event handler. Thus, submit buttons are 
defined as follows: 



^4 Note that this function must be flexible so that the first argument, which can only
be a logical variable, is instantiated by the application of this function.

Fig. 1. A simple string reverse/duplication form



button :: String -> EventHandler -> HtmlExp
button txt handler =
  HtmlEvent (HtmlElem "INPUT" [("TYPE","SUBMIT"),("NAME","EVENT"),
                               ("VALUE",txt)]) handler

The argument txt is the text shown on the button and the attribute NAME is 
later used to identify the selected submit button (since several buttons can occur 
in one form, see Sect. 6). 

To see a simple but complete example, we show the specification of a form 
where the user can enter a string and choose between two actions (reverse or 
duplicate the string, see Figure 1):^5

revdup = return $ Form "Question"
           [htxt "Enter a string: ", textfield tref "", hrule,
            button "Reverse string"   revhandler,
            button "Duplicate string" duphandler]
 where
   tref free

   revhandler env = return $ Form "Answer"
     [h1 [htxt ("Reversed input: " ++ reverse (env tref))]]

   duphandler env = return $ Form "Answer"
     [h1 [htxt ("Duplicated input: " ++ env tref ++ env tref)]]

Note the simplicity of retrieving values entered into the form: since the event 
handlers are called with the appropriate environment containing these values, 
they can easily access these values by applying the environment to the appro- 
priate CGI reference, like (env tref). This structure of CGI programming is 
made possible by the functional as well as logic programming features of Curry. 

Forms are executed by a special wrapper function that performs the trans- 
lation into concrete HTML code, decoding the entered values and invoking the 
correct event handler. This wrapper function has the following type: 

runcgi :: String -> IO HtmlForm -> IO ()

^5 The predefined right-associative infix operator f $ e denotes the application of f to
the argument e.

It takes a string (the URL under which this CGI program is accessible on the
server) and an I/O action returning a form and returns an I/O action which, 
when executed, returns the HTML code of the form. Thus, the above form is 
executed by the following main function 

main = runcgi "revdup.cgi" revdup

provided that the executable of this program is stored in revdup.cgi.



5 Server Side Web Scripting 

In this section we will show by various examples that the components for web 
server programming introduced so far (i.e., logical variables for CGI references, 
associated event handlers depending on CGI environments) are sufficient to solve 
typical problems in CGI programming in an appropriate way, like handling se- 
quences of interactions or holding intermediate states between interactions. 



5.1 Accessing the Web Server Environment 

From the previous example it might be unclear why the event handlers as well
as the wrapper function assume that the form is encapsulated in an I/O action.
Although this is unnecessary for applications where the web server is used as 
a “computation server” (where the result depends only on the form inputs), in 
many applications the clients want to access or manipulate data stored on the 
server. In these cases, the web service program must be able to access the server 
environment which is easily enabled by running it in the I/O monad. 

As a simple example for such kinds of applications, we show the definition of 
a (not recommendable) form to retrieve the contents of an arbitrary file stored 
at the server: 

getfile = return $ Form "Question"
            [htxt "Enter local file name:", textfield fileref "",
             button "Get file!" handler]
 where
   fileref free

   handler env = readFile (env fileref) >>= \contents ->
                 return $ Form "Answer"
                   [h1 [htxt ("Contents of " ++ env fileref)],
                    verbatim contents]

Here it is essential that the event handler is executed in the I/O monad, otherwise 
it has no possibility to access the contents of the local file via the I/O action 
readFile before computing the contents of the returned form. In a similar way, 
arbitrary data can be retrieved or stored by the web server while executing CGI 
programs. 


5.2 Interaction Sequences

In the previous examples the interaction between the client and the web server 
is quite simple: the client sends a request by filling a form which is answered 
by the server with an HTML document containing the requested information. 
In realistic applications it is often the case that the interaction is not finished 
by sending back the requested information but the client requests further (e.g., 
more detailed) information based on the received results. Thus, one has to deal
with longer sequences of interactions between the client and the server.

Our programming model provides a direct support for interaction sequences. 
Since the answer provided by the event handler is an HTML form rather than 
an HTML expression, this answer can also contain further input elements and 
associated event handlers. By nesting event handlers, it is straightforward to im- 
plement bounded sequences of interactions and, therefore, we omit an example. 

A more interesting question is whether we can implement other control ab- 
stractions like arbitrary loops. For this purpose, we show the implementation of 
a simple number guessing game: the client has to guess a number known by the 
server, and for each number entered by the client the server responds whether 
this number is right, smaller or larger than the number to be guessed. If the 
guess is not right, the answer form contains an input field where the client can 
enter the next guess. 

Due to the underlying declarative language, we implement looping constructs 
by recursion. Thus, the event handler computing the answer for the client con- 
tains a recursive call to the initial form which implements the interaction loop. 
The entire implementation of this number guessing game is as follows: 

guessform = return $ Form "Number Guessing" guessinput

guessinput =
  [htxt "Guess a number: ", textfield nref "",
   button "Check" (guesshandler nref)]  where nref free

guesshandler nref env =
  let nr = readint (env nref)
   in return $ Form "Answer"
        (if nr==42
           then [htxt "Right!"]
           else [htxt (if nr<42 then "Too small!" else "Too large!"),
                 hrule] ++ guessinput)

guessinput is an HTML expression corresponding to the initial form which
contains an input field for entering the client’s guess. guesshandler is the associated
event handler where the CGI reference to the input field is the first
argument of the handler. It checks the number entered by the client (readint
converts a string into a number) and returns the different answers depending on
the client’s guess. If the guess is not right, the guessinput is appended to the
answer which implements the recursive call.
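
The recursion pattern can be replayed outside a web server. The following Python sketch is an analogy under simplifying assumptions, not the Curry program itself: a form is a plain dictionary, each form gets a fresh reference object, and the hypothetical helper play stands in for a client that submits one guess after another.

```python
SECRET = 42

def guess_input():
    nref = object()                       # reference to the input field
    return {"text": ["Guess a number: "], "field": nref,
            "handler": lambda env: guess_handler(nref, env)}

def guess_handler(nref, env):
    nr = int(env(nref))
    if nr == SECRET:
        return {"text": ["Right!"], "field": None, "handler": None}
    hint = "Too small!" if nr < SECRET else "Too large!"
    form = guess_input()                  # the "recursive call": ask again
    form["text"] = [hint] + form["text"]
    return form

def play(guesses):
    """Simulate a client submitting the given guesses in order."""
    form, transcript = guess_input(), []
    for g in guesses:
        if form["handler"] is None:       # game is over, no further input
            break
        field = form["field"]
        env = lambda ref, field=field, g=g: str(g) if ref is field else ""
        form = form["handler"](env)
        transcript.append(form["text"][0])
    return transcript
```

The loop lives entirely in the data: a wrong guess yields an answer form that again contains the input field and its handler, so no explicit looping construct is needed on the server side.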

It should be clear that this general recursion pattern can be extended in 
various ways. For instance, counting the number of guesses made by the client is 
quite simple: the only change to the above program is the addition of a counter
argument to guessinput and guesshandler which is initialized in the main
function guessform and incremented in each recursive call.



5.3 Handling Intermediate States 

A nasty problem in many CGI applications is the handling of intermediate states 
due to the fact that HTTP is a stateless protocol. For instance, in electronic com- 
merce applications, the clients have shopping baskets where the already selected 
items are stored, and the contents of these baskets must be kept between the 
interactions. Storing this information on the server side has several drawbacks. 
For instance, the client wants to identify himself only after he really orders the 
items, i.e., during the selection phase the server cannot uniquely associate the se- 
lections to a client. Furthermore, the client might not proceed with his selections 
so that the server does not know whether the basket information can be deleted 
(which is necessary at some point to avoid a memory overflow). Therefore, it is 
often better to store such client-dependent information on the client side. For 
this purpose, one can have HTML forms with input elements of type HIDDEN 
which have no visual representation but can be used to pass client-dependent 
information between interactions. “Raw” HTML/CGI programmers must ex- 
plicitly handle these fields which is awkward and a source of many programming 
problems. 

Our programming model offers a much simpler solution to this problem. By 
nesting event handlers (which is allowed in languages with lexical scoping like 
Curry), one can directly refer to input elements in previous forms. To be more
concrete, we consider a sequence of HTML forms where the client enters his first 
name in the first form and his last name in the second form. The complete name 
is returned in the third form. This example can be implemented as follows: 

nameform = return $ Form "First Name Form"
             [htxt "Enter your first name: ", textfield first "",
              button "Continue" fhandler]
 where
   first free

   fhandler _ =
     return $ Form "Last Name Form"
       [htxt "Enter your last name: ", textfield last "",
        button "Continue" lhandler]
    where
      last free

      lhandler env = return $ Form "Answer"
        [htxt ("Hi, " ++ env first ++ " " ++ env last)]

Note that, due to lexical scoping, the variable first is visible in the lhandler 
without explicitly passing it as an argument. 
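
The role of lexical scoping can be illustrated with closures. The following Python sketch is an analogy, not the library's implementation: the class Ref, the dictionary layout of a form and the helper run_session are assumptions of this sketch, and run_session carries earlier inputs along in the way hidden fields would.

```python
class Ref:
    """Stands in for a CGI reference introduced by a free variable."""

def name_form():
    first = Ref()

    def fhandler(_env):
        last = Ref()

        def lhandler(env):
            # 'first' is visible here through lexical scoping (a closure).
            return {"title": "Answer", "fields": [], "handler": None,
                    "text": "Hi, " + env(first) + " " + env(last)}

        return {"title": "Last Name Form", "fields": [last],
                "handler": lhandler, "text": "Enter your last name: "}

    return {"title": "First Name Form", "fields": [first],
            "handler": fhandler, "text": "Enter your first name: "}

def run_session(form, inputs):
    """Submit one string per form, keeping the inputs of earlier forms."""
    carried = {}
    for value in inputs:
        for ref in form["fields"]:
            carried[ref] = value
        form = form["handler"](lambda ref: carried[ref])
    return form

answer = run_session(name_form(), ["Ada", "Lovelace"])
```

The programmer never mentions hidden fields; passing earlier values on is left to the surrounding machinery, here played by run_session.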


5.4 Improving Compositionality

It is well known that an advantage of functional programming is the direct sup- 
port for building application-oriented abstractions, thus, increasing modularity 
[10]. Unfortunately, “raw” CGI as well as functional libraries for CGI programming
such as [13] do not support compositionality in CGI programming due to the
use of fixed strings for identifying form elements. In the following, we will show 
that our approach to web service programming improves compositionality by 
exploiting the functional and logic features of the base language. 

As an example, consider that we want to add to each web page of a set of 
dynamic web pages a search field where the client can retrieve some specific 
information, e.g., the email address of a person. It is reasonable to define for this 
purpose a sequence of HTML elements abstracting such a search field together 
with its event handler. In our approach, this can be implemented as follows: 

emailSearch =
  [hrule, htxt "Enter a name: ", textfield nref "",
   button "search email" lookup, hrule]
 where
   nref free

   lookup env = ... getEmail (env nref) ...

The code for the event handler lookup is not completely shown since this depends 
on accessing a data base containing the email addresses.^6 The important point is
that the abstraction emailSearch can be used as any other sequence of HTML 
elements without taking care of the names of the input fields since the field 
identifier nref is a local variable in emailSearch and, thus, not visible outside 
this abstraction. For instance, the HTML sequence 

[..., textfield nref "", ...] ++ emailSearch ++ ...

causes no name clash between the different field identifiers due to the lexical 
scoping of the underlying programming language. This is not true in “raw” CGI 
programming where the programmer has to be careful about the selection of field 
names to avoid potential name conflicts (which can result in nasty programming 
errors).^7 This example shows the improved compositionality by our abstraction
layer for web service programming. 
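
The difference between local references and fixed strings can be made concrete. The following Python sketch is an analogy: the class Ref, the tuple encoding of elements and the naming scheme FIELD_n used by the hypothetical assign_names are assumptions of this sketch, not details of the Curry library.

```python
import itertools

class Ref:
    """A reference whose name is unknown until the wrapper assigns one."""
    def __init__(self):
        self.name = None

def email_search():
    nref = Ref()                          # local to this component
    return [("text", "Enter a name: "), ("textfield", nref)]

def page():
    nref = Ref()                          # same identifier, different reference
    return [("textfield", nref)] + email_search() + email_search()

def assign_names(elements):
    """Give every still unnamed reference a unique name, in document order."""
    counter = itertools.count()
    for kind, payload in elements:
        if kind == "textfield" and payload.name is None:
            payload.name = f"FIELD_{next(counter)}"
    return [p.name for kind, p in elements if kind == "textfield"]

names = assign_names(page())

# With fixed strings, the same composition produces a clash.
def raw_email_search():
    return [("text", "Enter a name: "), ("textfield", "name")]

raw_names = [p for kind, p in [("textfield", "name")] + raw_email_search()
             if kind == "textfield"]
```

Here names contains three distinct entries although the identifier nref is used in both page and email_search, whereas raw_names contains the string "name" twice.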

6 Implementation 

Our library for web service programming is completely implemented in Curry. It 
does not require any extension to web servers but uses only the standard features 
of CGI. Since these are supported by most web servers, our library can be used 
with most web servers (where a Curry system is also installed). In this section,
we discuss the implementation of our programming model with CGI references
and event handlers on top of the standard CGI features.

^6 For instance, this can be easily done by sending a message to an address server using
the features for distributed programming in Curry [5].

^7 Although one can use several forms in one HTML document to avoid name conflicts,
this does not work in general if some input fields should be shared.

The entire implementation is performed by the main wrapper function 
runcgi which basically takes a specification of an HTML form and translates it 
into the corresponding concrete HTML text. Moreover, it performs the following 
tasks: 

— Assigning unique identifiers (strings) to the CGI references occurring in the 
form specification, i.e., the logical variables in the CGI references are instan- 
tiated to these string identifiers. 

— Assigning unique identifiers (strings) to each event handler occurring in the 
form specification. For instance, each submit button contains after this as- 
signment a name attribute of the form EVENT_s, where s is a string uniquely 
identifying the event handler associated to the button that the client has 
pressed to submit the form. 

— Adding the input values of the previous (enclosing) forms as hidden inputs. 

If a web server receives a request to execute a service implemented with our 
library, it executes the wrapper function runcgi applied to the corresponding 
form (compare end of Sect. 4) in the environment of the web server. Thus, runcgi 
first checks the environment variables in order to decode the list of input values 
entered by the user (which might be empty for the initial form). If there is no 
input value named EVENT_s, then this is the call of the top-level form and not a 
submission of a previous form. In this case, the top-level form is translated and 
written on the standard output stream so that the web server returns it to the 
client. If there is an input value identifying the selected handler (i.e., the name 
EVENT_s is defined in the input environment), runcgi selects the associated event 
handler in the form specification and executes it together with the current CGI 
environment as an argument. 
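
The dispatch step can be sketched as follows. This Python analogy makes several assumptions that are not taken from the library: the encoded input is passed in as a string instead of being read from environment variables or the standard input, the handlers are given as a table indexed by the identifier s of EVENT_s, and the field name FIELD_0 is hypothetical.

```python
from urllib.parse import parse_qsl

def decode_inputs(raw):
    """Decode a URL-encoded string into a list of name/value pairs."""
    return parse_qsl(raw, keep_blank_values=True)

def run_form(top_form, handlers, raw_input):
    pairs = decode_inputs(raw_input)
    selected = [name[len("EVENT_"):] for name, _ in pairs
                if name.startswith("EVENT_")]
    if not selected:
        return top_form()                 # initial call: show the top-level form
    values = dict(pairs)
    env = lambda field_name: values[field_name]
    return handlers[selected[0]](env)     # run the handler of the pressed button

def top_form():
    return "FORM: enter a string"

handlers = {
    "0": lambda env: "Reversed input: " + env("FIELD_0")[::-1],
    "1": lambda env: "Duplicated input: " + env("FIELD_0") * 2,
}

first = run_form(top_form, handlers, "")
answer = run_form(top_form, handlers, "FIELD_0=abc&EVENT_1=Duplicate+string")
```

The same program thus serves both the initial request and every later submission; which branch is taken depends only on whether an EVENT_s entry is present in the decoded input.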

The current CGI environment is computed as follows. First, the list of 
name/value pairs passed in a string representation to the CGI program is decoded
and stored in a list of pairs of strings. The selection of the value associated
to a CGI reference in this list is implemented by a simple list lookup function 

cgiGetValue :: [(String, String)] -> CgiRef -> String 

If cenv denotes the current list of decoded name/value pairs, the cor- 
responding CGI environment can be computed by the partial application 
(cgiGetValue cenv) which has the required type CgiRef -> String. Although 
the implementation of environments can be improved by more sophisticated data 
structures (e.g., balanced search trees), our practical experience indicates that 
this simple implementation is sufficient. 
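
The construction of an environment by partial application can be mirrored in Python with functools.partial. This is an analogy only: references are represented by their name strings, and returning the empty string for an unknown reference is an assumption of this sketch, since the behaviour in that case is not specified above.

```python
from functools import partial

def cgi_get_value(cenv, ref):
    """Linear lookup of the value associated with a reference name."""
    for name, value in cenv:
        if name == ref:
            return value
    return ""                             # assumption: unknown references yield ""

cenv = [("FIELD_0", "abc"), ("EVENT_0", "Reverse string")]
env = partial(cgi_get_value, cenv)        # analogue of (cgiGetValue cenv)
```

Since env is an ordinary function from references to strings, the list could later be replaced by a search tree without changing any event handler.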



7 Conclusions and Related Work 

In this paper we have presented a new model for programming web services 
based on the standard Common Gateway Interface. Since this model is put 
on top of the multi-paradigm language Curry, we could exploit functional as
well as logic programming techniques to provide a high abstraction level for our 
programming model. We have used functional abstractions for specifying HTML 
forms as expressions of a specific data type so that only well-formed HTML 
structures can be written. Furthermore, higher-order functional abstractions are 
used to attach event handlers to particular HTML elements like buttons and to 
provide a straightforward access to input values via an environment model. Since 
event handlers can be nested, we have a direct support to define sequences (or 
sessions) of interactions between the client and the server where states or input 
values of previous forms are available in subsequent interactions. This overcomes 
the stateless nature of HTTP. On the other hand, the logical features of Curry 
are used to deal with references to input values in HTML forms. Since a form 
can have an arbitrary number of input values, we consider them as “holes” in an 
HTML expression which are filled by the user so that event handlers can access 
these values through an environment. Using logical variables to refer to input 
values is more appropriate than the use of strings as in raw HTML since some 
errors (e.g., misspelled names) are detected at compile time and HTML forms
can be composed without name clashes. 

Since Curry has more features than used in the examples of this paper, we 
briefly discuss the advantages of using them. Curry subsumes logic programming,
i.e., it offers not only logical variables but also built-in search facilities
and constraint solving. Thus, one can easily provide web services where constraint
solving and search are involved (e.g., web services with a natural language
interface), as shown in the (purely logic-based) PiLLoW library [2]. Since event
handlers must be deterministic functions, the encapsulation of search in Curry 
[8] becomes quite useful for such kinds of applications. Furthermore, Curry ex- 
ploits the logic programming features to support concurrent and distributed 
programming by the use of port constraints [5]. This can be used to retrieve 
information from other Internet servers (as done in the web pages for Curry to 
generate the members of the Curry mailing list^8 where the web server interacts
with a database server). 

Finally, we compare our approach with some other proposals for providing a 
higher level for web programming than the raw CGI. MAWL [12] is a domain- 
specific language for programming web services. In order to allow the checking 
of well-formedness of HTML documents, in MAWL documents are written in 
HTML with some gaps that are filled by the server before sending the document 
to the client. Since these gaps are filled only with simple values, the generation of 
documents whose structure depends on some computed data is largely restricted. 
To overcome this restriction, MAWL offers special iteration gaps which can be 
filled with list values but more complex structures, like unbounded hierarchical 
structures, are not supported in contrast to our approach. On the positive side, 
MAWL has a special (imperative) language to support the handling of sequences 
of interactions with traditional imperative control structures and the management
of state variables. However, the programming model is different from ours.
In MAWL the presentation of an HTML document is considered as a remote
procedure call in the sequence of interaction statements. Therefore, there is exactly
one program point to continue the handling of the client’s answer, whereas
our model allows several event handlers that can be called inside one document
(see the form revdup in Sect. 4).

The restrictions of MAWL to create dynamic documents have been weakened 
in DynDoc [15] that supports higher-order document templates, i.e., the gaps 
in a document can be filled with other documents that can also contain gaps. 
Thus, unbounded hierarchically structured documents can be easily created. In 
contrast to our approach, DynDoc is based on a specific language for writing 
dynamic web services while we exploit the features of the existing high-level 
language Curry for the same task so that we can immediately use all features 
and libraries for Curry to write web applications, like higher-order functions, 
constraints, ports for distributed programming etc. 

Similarly to our library-based approach, there are also libraries to support 
HTML and CGI programming in other functional and logic languages. Meijer 
[13] has developed a CGI library for Haskell that defines a data type for HTML 
expressions together with a wrapper function that translates such expressions 
into a textual HTML representation. However, it does not offer any abstrac- 
tion for programming sequences of interactions. These must be implemented 
in the traditional way by choosing strings for identifying input fields, passing 
states as hidden input fields etc. Similarly, the representation of HTML documents
in Haskell proposed by Thiemann [17] concentrates only on ensuring
the well-formedness of documents and does not support the programming of interactions.
Nevertheless, his approach is interesting since it demonstrates how
a sophisticated type system can be exploited to include more static checks on 
the document structure, in particular, to check the validity of the attributes 
assigned to HTML elements. Hughes [11] proposes a generalization of monads, 
called arrows, to deal with sequences of interactions and passing state in CGI 
programming but, in contrast to our approach, his proposal does not contain 
specific features for dealing with references to input fields. The PiLLoW library 
[2] is an HTML/CGI library for Prolog. Due to the untyped nature of Prolog, 
static checks on the form of HTML documents are not supported. Furthermore, 
there is no higher-level support for sequences of interactions. 

Since the programming model proposed in this paper needs no specific exten- 
sion to Curry, it provides appropriate support to implement web-based interfaces 
to existing Curry applications. Moreover, it can be considered as a domain- 
specific language for writing web service scripts. Thus, this demonstrates that a 
multi-paradigm declarative language like Curry can also be used as a scripting 
language for server side web applications. We have shown that the functional as 
well as the logic features provide a good infrastructure to design such a domain- 
specific language. The implementation of this library is freely available with 
our Curry development system PAKCS [7]. All examples in this paper are ex- 
ecutable with this implementation. Furthermore, the library is currently used 
to dynamically create parts of the web pages for Curry^9, to handle the submission
information for the Journal of Functional and Logic Programming^10, and
for correcting the students’ home assignments in the introductory programming
lecture in our department (among others).

Although our programming model and its implementation work well in all
these applications, it might be interesting for future work to provide alternative
implementations with specialized infrastructures (e.g., servlets, security layers
etc.) for the same programming model.

Abstract. We explore the use of a number of Logic Programming techniques
for generating dynamic Web content and the underlying architecture
and implementation issues. We describe a Prolog to VRML mapping
allowing generation of dynamic VRML pages through CGI and
server side Prolog scripts. BinProlog’s Assumption Grammars (a form
of multi-stream DCGs with implicit arguments and temporary assertions,
scoping over the current AND-continuation) are used to mimic
VRML syntax and semantics directly, without using a preprocessor. The
resulting generator allows quick development of integrated knowledge
processing and data visualization Web applications. Using BinProlog’s
multi-threaded networking primitives, we describe a design integrating
in a self-contained Prolog application a Web Server, a Data Extraction
module and an Assumption Grammar based VRML generator.
Keywords: Internet Programming with Prolog, Dynamic VRML content,
Web Architectures, Logic Programming Tools, Prolog based Client-Server
Programming, Prolog Networking, Definite Clause Grammars



1 Introduction 

Generating dynamic Web content is becoming increasingly important as a result
of automation of Web-mediated exchanges between heterogeneous information
sources, with meta-search, Web-clipping, Web-database connectivity, and visualisation
of complex dynamic data now present in almost any major Web application.

VRML (Virtual Reality Modeling Language) [13, 10, 2, 11] is a de facto 
Internet standard for 3D visualization. Its applications range from modeling 
unique objects like the International Space Station, to shared virtual reality and 
3D visualization of complex data. 

VRML does not provide extensive programming language constructs, and 
using its embedded JavaScript-like scripting language is cumbersome. Although 
an External Authoring Interface [22] allows controlling VRML worlds from Java, 
the combination of multiple layers running within a browser makes EAI appli- 
cations extremely fragile. 



I.V. Ramakrishnan (Ed.): PADL 2001, LNCS 1990, pp. 93-107, 2001.

Prolog is a flexible and powerful AI language, particularly well suited for
knowledge processing, Internet data extraction, planning, machine learning,
hypothetical reasoning, etc.

This has motivated us to combine the two technologies to allow the integra- 
tion of our Prolog based Internet data extraction tools and VRML’s 3D visual- 
ization and animation capabilities. 

As a case study in generation of dynamic Web content, we describe a Prolog 
to VRML translator, which allows generation of dynamic VRML pages through 
CGI and server side Prolog scripts, following the technique introduced in [15]. 

Assumption Grammars [20, 5] are multi-stream DCGs with implicit arguments
and temporary assertions, scoping over the current AND-continuation¹.
We will use them to mimic VRML syntax and semantics with an Assumption 
Grammar based VRML generator written in BinProlog. The resulting “scripting 
language” will allow quick development of integrated knowledge processing and 
high quality data visualization applications. Using BinProlog’s multi-threading, 
we will integrate in a self-contained Prolog application a Web Server, a Data Ex- 
traction module, and an Assumption Grammar based VRML generator. From 
the user’s point of view our integrated set of tools looks like a dynamic VRML 
page which reacts to user input through a set of HTML forms and VRML
anchors². This allows the user to provide some parameters or click on sensor enabled
VRML areas and see as a result the output of our server side VRML generator, 
displayed by the browser as a 3D animation. 

2 Overview of Assumption Grammars 

As Assumption Grammars [20, 5] support multiple-stream DCG grammars,
implemented as an ADT instead of the preprocessing technique used in most
Prologs, they are particularly useful in the context of Prolog based Internet
programming, where multiple DCG streams are needed for parsing or generating
Web pages, for encoding and decoding Prolog terms over socket connections, or
for natural language query processing.

We will now give an overview of the set of basic operations which are part of
BinProlog’s Assumption Grammar API.



2.1 Assumed Code, Intuitionistic and Linear Implication 

Intuitionistic */1 temporarily adds a clause usable in later proofs, through calls
to -/1. Such a clause can be used an indefinite number of times, much like
asserted clauses, except that it vanishes on backtracking. Its scoped version =>
(intuitionistic implication [8]) as used in

Clause=>Goal or [File]=>Goal 

¹ This allows producing and consuming backtrackable Prolog assertions as a means to
exchange information between arbitrary points of the parsing process.

² Hyperlinks attached to VRML nodes.

makes Clause or the set of clauses found in File available only during the proof
of Goal. Both vanish on backtracking. Clauses are usable an indefinite number
of times in the proof, i.e. for instance

?- *a(13),-a(X),-a(Y).

will succeed, as a(13) is usable multiple times, and therefore matches both a(X)
and a(Y).

The linear assumption operator +/1 temporarily adds a clause usable at
most once in later proofs, through use of -/1. This assumption also vanishes
on backtracking. Its scoped version -: (linear implication [9, 7]) as used in

Clause-:Goal or [File]-:Goal

makes Clause or the set of clauses found in File available only during the proof
of Goal. Assumptions vanish on backtracking and are accessible only inside their
stated scope. Each clause is usable at most once in the proof, i.e. for instance

?- +a(13),-a(X),-a(Y).

fails, as a(13) is usable only once and is therefore consumed after matching a(X).

One can freely mix linear and intuitionistic clauses and implications for the 
same predicate, for instance as in: 

?- a(10)-:a(11)=>a(12)-:a(13)=>(-a(X),-a(X)).

X=11 ;

X=13 ;
no

At the time a(X) is called with -a(X) we have the assumed facts a(10),
a(11), a(12) and a(13), as implications are embedded left to right. Given that
X is consumed twice, only the values 11 and 13, assumed with the intuitionistic
implication =>, will be returned as solutions. The two attempts to consume
the values a(10) and a(12) twice, assumed with linear implication, will produce
no solutions.
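
The behaviour of this query can be imitated outside Prolog. The following Python sketch is only an illustration and is not BinProlog code: the store layout, the tags "lin" and "int", and both function names are assumptions made for the example. A linear assumption is removed from the store when it is consumed, while an intuitionistic one stays available.

```python
from typing import Iterator, List, Tuple

# An assumption is (kind, value): "lin" may be consumed once,
# "int" may be consumed any number of times. (Hypothetical encoding.)
Store = Tuple[Tuple[str, int], ...]

def consume(store: Store) -> Iterator[Tuple[int, Store]]:
    """Yield (value, remaining_store) for each way of matching -a(X)."""
    for i, (kind, value) in enumerate(store):
        if kind == "lin":
            yield value, store[:i] + store[i + 1:]   # linear: used up
        else:
            yield value, store                        # intuitionistic: kept

def consume_same_twice(store: Store) -> List[int]:
    """All X for which (-a(X), -a(X)) succeeds under the given store."""
    answers = []
    for x, rest in consume(store):
        for y, _ in consume(rest):
            if x == y:
                answers.append(x)
    return answers

store = (("lin", 10), ("int", 11), ("lin", 12), ("int", 13))
print(consume_same_twice(store))  # [11, 13]
```

The nested loops play the role of backtracking: every choice for the first -a(X) is tried against every remaining choice for the second one.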



2.2 The Assumption Grammar API 

Assumption Grammars [20, 5] are logic programs augmented with linear and
intuitionistic implications scoped over the current continuation, and implicit
multiple DCG accumulators³.

³ An accumulator is a sequence of chained logic variables, for instance S1,S2,S3 as
used in the DCG rule a-->b,c which expands to a(S1,S3):-b(S1,S2),c(S2,S3).
Assumption Grammars provide the equivalent of an arbitrary number of such
variable chains through an internal implementation using backtrackable destructive
assignment, instead of the program transformation approach used to implement
single stream DCGs.

The complete Assumption Grammar API consists of 3 assumption operators:

*A: adds intuitionistic assumption A
+A: adds linear assumption A
-A: matches an assumption in current scope or fails

and 3 DCG-stream operators.

Implicit DCG-streams can be seen as providing implicit input and output
sequence arguments. While operationally equivalent to program transformation
based DCGs, no preprocessing is required, as terminals interact with the implicit
DCG stream directly, through a set of BinProlog built-ins:

#<Tokens: initializes/unifies the DCG stream with a list of Tokens
#Token: matches or inserts a Token in the implicit DCG stream
#>State: returns the current State of the implicit DCG stream

The following pattern explains the use of implicit accumulators in
matching/recognition mode:

?- #<[a,b,c,d],#A,#B,#>LeftOver.

A=a,
B=b,
LeftOver=[c,d]

For generation mode we start with an unbound value Xs for the implicit DCG
stream, then we bind it to list elements a,b,c by stating that a,b,c are terminals.

?- #<Xs,#a,#b,#c,#>[].

Xs=[a,b,c]

The combination of DCG stream operations and assumptions is particularly
well suited for generating hierarchical data structures where properties hold for
selected subobjects.
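
As a rough illustration of the two modes, the sketch below models a single implicit stream as a Python object. It is not the BinProlog implementation: the class DcgStream and its methods tok and state are hypothetical names, and the sketch has neither unification nor backtracking, so generation mode simply collects the emitted terminals.

```python
class DcgStream:
    """Toy stand-in for one implicit DCG stream (hypothetical API)."""

    def __init__(self, tokens=None):
        # tokens given  -> recognition mode: terminals are consumed
        # tokens absent -> generation mode: terminals are collected
        self.generating = tokens is None
        self.items = [] if tokens is None else list(tokens)

    def tok(self, expected=None):
        """Rough analogue of #Token."""
        if self.generating:
            self.items.append(expected)
            return expected
        if not self.items:
            raise ValueError("no token left")
        head = self.items.pop(0)
        if expected is not None and head != expected:
            raise ValueError(f"expected {expected!r}, got {head!r}")
        return head

    def state(self):
        """Rough analogue of #>State: what is left, or what was emitted."""
        return list(self.items)

# Recognition mode, mirroring ?- #<[a,b,c,d],#A,#B,#>LeftOver.
s = DcgStream(["a", "b", "c", "d"])
a, b = s.tok(), s.tok()
print(a, b, s.state())   # a b ['c', 'd']

# Generation mode, mirroring ?- #<Xs,#a,#b,#c,#>[].
g = DcgStream()
for t in "abc":
    g.tok(t)
print(g.state())         # ['a', 'b', 'c']
```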

3 An Overview of VRML 

VRML’97 is a hierarchical scene description language which is human readable 
[13, 2]. The VRML scene is composed of nodes and routes. Each node can have 
a unique name for routing or re-use, a type, a number of associated events (both
incoming and outgoing), and a number of fields which hold the data for the node. In
addition, group nodes can have many other nodes as children. One special node
(script) can hold a JavaScript program to control behavior in the scene. Group
nodes can hold many other nodes, associating them in some fashion. Positioning
of objects in the world is achieved through the transform node. The transform
node applies scale operations, rotations and translations to its children. To
achieve complex repeated transforms, transform nodes may be nested. Sensors
can be thought of in two categories, those associated with geometry and those for
sensing time. The sensors can respond to various conditions such as proximity
to an avatar, mouse clicks, mouse movement or whether an object is in the
field of view of the user. Each interpolator node has a set of keys and a set of
keyValues. When a set_fraction eventIn is received, the values are interpolated
to get the correct value_changed eventOut. Prototypes are the mechanism used
for expanding the VRML language. A prototype has a new type name followed 
by a list of fields and events in brackets. Interacting with VRML worlds or 
animating them will cause events to be generated. These are sent from one node 
to another. Here are some typical event sequence patterns: usually a Sensor of 
some kind generates an event to start the animation. Sometimes processing of 
events is required which can be done through a simple script node. For non- 
reactive subcomponents (for instance to run an animation or to supply a stream 
of events for a color interpolation) Time Sensors can be used. 

VRML’s hierarchical space representation suggests a natural mapping to 
Prolog terms. Interestingly enough, both VRML and Prolog describe atempo- 
ral structures - space in the case of VRML and logical axioms in the case of 
Prolog. To some extent, VRML’s declarative space representation and Prolog’s 
declarative truth representation share the same difficulties when faced with the 
problem of representing time and change. Not surprisingly, VRML’s event propa- 
gation mechanism (needed for animation) has a strong declarative flavor. In fact, 
it is quite similar to Prolog’s DCG argument chaining: events are propagated 
with ROUTE statements between chained nodes. 

3.1 Syntactical Similarities between Prolog and VRML 

VRML’s concrete syntax is closer to languages like Life [1] than to Prolog, i.e.
all arguments are named, not positional. However, by using compound terms
tagged with VRML’s attribute names and some user defined operators (like @)
we can easily mimic VRML syntax in Prolog.

Syntax Mapping of Some Key Idioms. Our mapping is so simple that it
does not even need a preprocessor. Instead, we use Prolog’s reader with some
prefix and infix operator definitions such that a standard Prolog reader accepts
constructs like @[...] and @{...} as terms. As the same applies to Assumption
Grammars, this pseudo-VRML notation turns out to be just plain Prolog. The
details of the mapping are as follows:

Prolog         VRML

@{ }      ==>  { }
@[ ]      ==>  [ ]
f(a,b,c)  ==>  f a b c
a is b    ==>  a IS b
a=b       ==>  a b

The key idea behind this mapping is to allow writing unrestricted VRML code
in Prolog, while being able to interleave it with “compile time” Prolog computa- 
tions. The following example illustrates how to build pseudo VRML in Prolog, 
in the (interesting) case when a VRML PROTO construct is generated, which 
in turn can be used to model multiple instances of VRML objects. 

proto anAppearance @[
  exposedField('SFColor')=color(1,0,0),
  exposedField('MFString')=texture@[]
]
@{
  'Appearance' @{
    material='Material' @{
      diffuseColor is color
    },
    texture='ImageTexture' @{
      url is texture
    }
  }
}.

The reader will notice that the resulting VRML code is in fact a close syntactic 
variant of its Prolog template. 

#VRML V2.0 utf8

PROTO anAppearance
[
  exposedField SFColor color 1 0 0,
  exposedField MFString texture []
] {
  Appearance { material
    Material { diffuseColor IS color }
    texture ImageTexture {
      url IS texture
    }
  }
}

3.2 Building Prolog Macros 

While VRML’s PROTO constructs allow significant code reuse, it is sometimes
preferable to build simpler VRML code through Prolog “macro”-like constructs,
which, depending on their parameters, will expand to multiple VRML node
instances.

shape(Geometry):-
  % based on scoped assumptions (=>)
  default_color(Color),
  toVrml(
    aShape @{
      geometry(Geometry@{}),
      Color
    }
  ).

sphere:- shape('Sphere').
cone:- shape('Cone').
cylinder:- shape('Cylinder').
box:- shape('Box').



3.3 Using Prolog Macros 

We have added a ` (backquote) notation to force evaluation of Prolog macros
by our generator.

group @{
  children @[
    `transform(
      translation(0,10,4), scale(2,2,2), rotation(0,0,0,0),
      [`sphere, `cone]
    ),
    `transform(
      translation(5,0,0), scale(1,3,6), rotation(0,1,0,1.5),
      [`box]
    )
  ]
}.



The generated code looks as follows: 

Group {
  children [
    Transform {
      translation 0 10 4 scale 2 2 2 rotation 0 0 0 0
      children [ # aShape is a VRML 2.0 PROTO!
        aShape { geometry Sphere{} color 0.7 0.8 0.8
        },
        aShape { geometry Cone{} color 0.6 0.3 0.9
        }
      ]
    },
  ]
}

Using assumptions in the VRML generator macros. The following example illustrates
the use of a scoped implication to color blue a complete sub-object (cone),
itself expressed as a macro.

def saucer = `transform(
  translation(0,1,1),
  scale(0.6,0.2,0.6),
  rotation(0.5,0.5,0,1.5),
  [
    `sphere,
    % this can be used to propagate a color over a region
    `(color(blue)=>cone)
  ]
)

3.4 The Assumption Grammar Based VRML Generator 

Our generator is implemented as a predicate toVrml/1 which takes a Pseudo- 
VRML tree and expands it into an implicit argument DCG stream. The (sim- 
plified) code looks as follows: 

toVrml(X):-number(X),!,#X.
toVrml(X):-atomic(X),!,#X.
toVrml(A@B):-!,#indent(=),toVrml(A),toVrml(B).
toVrml(A=B):-!,toVrml(A),toVrml(B).
toVrml(A is B):-!,#indent(=),toVrml(A),#'IS',toVrml(B).
toVrml({X}):-!,#'{',#indent(+),toVrml(X),#indent(-),#'}'.
toVrml(`X):-!,X.
toVrml(X):-is_list(X),!,
  #'[',#indent(+),vrml_list(X),#indent(-),#']'.
toVrml(X):-is_conj(X),!,vrml_conj(X).
toVrml(T):-compound(T),#indent(=),vrml_compound(T).

A final step is performed by a simple post-processor which applies indentation
hints and formats the actual VRML output. VRML’97 ROUTEs are structurally
similar to DCGs: they are used to thread together streams of change expressed as
event routes. While a DCG-like notation could have been used to generate such
ROUTE statements automatically, we have preferred to use the simpler syntax
mapping mechanism as it is closer to what VRML programmers expect.
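
The recursive shape of such a generator can be sketched in Python as well. This is an analogy under stated assumptions, not a port of toVrml/1: the encoding (a dict for @{...}, a list for @[...], a tuple for a compound term such as f(a,b,c)) and the function name to_vrml are invented for the example, and indentation hints and macros are left out.

```python
def to_vrml(term, out):
    """Append VRML tokens for `term` to the list `out` (illustrative only)."""
    if isinstance(term, (int, float, str)):       # numbers and atoms
        out.append(str(term))
    elif isinstance(term, list):                   # @[...]  ->  [ ... ]
        out.append("[")
        for item in term:
            to_vrml(item, out)
        out.append("]")
    elif isinstance(term, dict):                   # @{...}  ->  { ... }
        out.append("{")
        for name, value in term.items():           # a=b  ->  a b
            out.append(name)
            to_vrml(value, out)
        out.append("}")
    elif isinstance(term, tuple):                  # f(a,b,c)  ->  f a b c
        for part in term:
            to_vrml(part, out)
    else:
        raise TypeError(f"unsupported term: {term!r}")

scene = ("Transform", {
    "translation": (0, 10, 4),
    "children": [("Shape", {"geometry": ("Sphere", {})})],
})
tokens = []
to_vrml(scene, tokens)
print(" ".join(tokens))
# Transform { translation 0 10 4 children [ Shape { geometry Sphere { } } ] }
```

In this encoding a field such as diffuseColor is color would be written as the dict entry "diffuseColor": ("IS", "color"). The token list plays the role of the implicit DCG stream; formatting is deferred to a later step, as in the post-processor described above.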

4 Web Application Architectures 

4.1 CGI Script Based VRML Generation

We have integrated our VRML generator with a BinProlog based CGI scripting
toolkit [17]. This allows generation of dynamic VRML content returned to the
browser as a result of a POST-method call (Fig. 1).




Fig. 1. A CGI based Architecture 




Fig. 2. Example of CGI generated dynamic VRML 

The process of generating a VRML page with a CGI script is as follows:

- the user clicks on an HTML form or VRML anchor

- a CGI script is invoked by the HTTP server

- the BinProlog based VRML generator is activated

- the dynamic VRML page is sent back to the client

- the browser’s VRML plugin displays the results

A CGI based demo of our generator (see Fig. 2) implementing this architecture
is available at http://www.binnetcorp.com/BinProlog.

Persistent server state can be maintained using BinProlog’s remote predicate 
calls to a persistent BinProlog server or through some higher level networking 
abstractions like mobile threads [19] or remote blackboards [6]. 



4.2 Server Side Prolog Based VRML Generation 

By using the BinNet Internet Toolkit [17] we can further simplify this CGI-based
architecture, as shown in Fig. 3. Instead of calling a CGI script each time
a form is sent to the server with a SUBMIT request (POST or GET) to provide
a dynamic VRML page, we can simply embed the VRML generator in
the server. As in most modern Web server architectures (Apache, Microsoft,
Sun etc.), a new BinProlog thread is spawned inside the server process for each 
request. Server state can be shared between multiple users and made persistent 
by having a backup thread periodically writing the server’s state to a Prolog 
dynamic clause file. This feature is particularly important for shared virtual 
world applications like LogiMOO [21]. 




Fig. 3. Server Side Prolog based VRML Generation Architecture 

To summarize, the process of generating a VRML page with a generator
embedded in the Web server is as follows:

- the user clicks on an HTML form or VRML anchor

- the VRML generator working as a server side Prolog thread is activated

- the dynamic VRML page is sent back to the client

- the browser’s VRML plugin displays the results

The multi-stream nature of Assumption Grammars is instrumental in embedding
the VRML generator in ordinary CGI script processing, which uses its
own DCG stream to parse input. The same thing happens in the case of the
server side use: other DCG streams are used to parse elements of the server side
HTTP protocol as well as to generate output. Assumption Grammar based I/O
shares some similarities with monadic I/O [23], as present in functional languages
like Haskell [12]. For instance, by using Assumption Grammars we are
able to separate the logic of sequencing the output from formatting and writing
to the output stream.



4.3 Combining the VRML Generator with Web Data Extraction 
and Persistent Server State 

A number of ongoing projects in our research group involve Web data extraction. 
A typical application architecture involves blackboard based agent coordination 
(Fig. 4). 



Linda Based Agent Coordination. Our agent coordination mechanism is
built on top of the popular Linda [4] coordination framework, enhanced with 
unification based pattern matching, remote execution and a set of simple client- 
server components merged together into a scalable peer-to-peer layer, forming 
a network of interconnected virtual places. The key Linda operations are the 
following: 



out(X):    puts X on the server
in(X):     waits until it can take an object matching X from the server
all(X,Xs): reads the list Xs matching X currently on the server

The presence of the all/2 collector avoids the need for backtracking over mul- 
tiple remote answers. Note that the only blocking operation is in/1. Typically, 
distributed programming with Linda coordination follows consumer-producer 
patterns with added flexibility over message-passing communication through as- 
sociative search. 
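
The three operations can be mimicked on a single machine with a shared, lock-protected list. The sketch below is an assumption-laden stand-in rather than the BinProlog or Jinni implementation: the class Blackboard, the method name in_ (in is a Python keyword), and the use of a Python predicate in place of unification based pattern matching are all choices made for the example.

```python
import threading

class Blackboard:
    """Toy in-process Linda space (hypothetical API)."""

    def __init__(self):
        self._items = []
        self._cond = threading.Condition()

    def out(self, item):
        """Put item on the blackboard and wake up waiting consumers."""
        with self._cond:
            self._items.append(item)
            self._cond.notify_all()

    def in_(self, matches, timeout=None):
        """Block until an item satisfying `matches` exists, then remove it."""
        def find():
            return next((x for x in self._items if matches(x)), None)
        with self._cond:
            if not self._cond.wait_for(lambda: find() is not None, timeout):
                raise TimeoutError("no matching item arrived")
            item = find()
            self._items.remove(item)
            return item

    def all(self, matches):
        """Return, without blocking, every item currently matching."""
        with self._cond:
            return [x for x in self._items if matches(x)]

bb = Blackboard()
taken = []
consumer = threading.Thread(
    target=lambda: taken.append(bb.in_(lambda t: t[0] == "quote", timeout=5)))
consumer.start()
bb.out(("news", "merger"))
bb.out(("quote", "aol", 89))
consumer.join()
print(taken, bb.all(lambda t: True))
# [('quote', 'aol', 89)] [('news', 'merger')]
```

As in the description above, only in_ blocks; out and all return immediately.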

Agent Coordination with Blackboard Constraints. A natural extension
to Linda is to enable agent threads with constraint solving for the selection of
matching terms on the blackboard, instead of plain unification. This is implemented
in Jinni [18] through the use of 2 builtins:

wait_for(Term,Constraint): waits for a Term on the blackboard, such that
Constraint is true, and when this happens, it removes the result of the match
from the blackboard with an in/1 operation. Constraint is either a single goal
or a list of goals [G1,G2,...,Gn] to be executed on the server.

notify_about(Term): notifies about this term one of the blocked clients
which has performed a wait_for(Term,Constraint), i.e.

notify_about(stock_offer(aol,89))

would trigger execution of a client having issued

wait_for(stock_offer(aol,Price),less(Price,90)).

In a client/server Linda interaction, triggering an atomic transaction when
data for which a constraint holds becomes available would be expensive. It
would require repeatedly taking terms out of the blackboard, through expensive
network transfers, and putting them back unless the client can verify that a
constraint holds. Our server side implementation checks a blackboard constraint
only after a match occurs between new incoming data and the head of a sus- 
pended thread’s constraint checking clause, i.e. an indexing mechanism is used 
to avoid useless computations. On the other hand, a mobile client thread can 
perform all the operations atomically on the server side, using local operations 
on the server, and come back with the results. 
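
The idea of checking a constraint on the server side, and only for suspended clients indexed under the incoming term, can be sketched as follows. This is a simplified illustration and not Jinni code: the class ConstraintBoard, the choice of the first tuple element as the index key, and the decision to drop a term when no waiting client accepts it are assumptions of the example.

```python
import threading
from collections import defaultdict

class ConstraintBoard:
    """Toy registry of suspended clients, indexed by functor (hypothetical)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._waiters = defaultdict(list)      # functor name -> waiting slots

    def wait_for(self, functor, constraint, timeout=None):
        """Block until a term with this functor satisfies `constraint`."""
        slot = {"event": threading.Event(), "term": None, "check": constraint}
        with self._lock:
            self._waiters[functor].append(slot)
        if not slot["event"].wait(timeout):
            with self._lock:
                if slot in self._waiters[functor]:   # really timed out
                    self._waiters[functor].remove(slot)
                    raise TimeoutError("no matching term arrived")
        return slot["term"]

    def notify_about(self, term):
        """Hand `term` to one blocked client whose constraint holds."""
        functor = term[0]
        with self._lock:
            # Only clients indexed under the same functor are examined.
            for slot in self._waiters[functor]:
                if slot["check"](term):
                    self._waiters[functor].remove(slot)
                    slot["term"] = term
                    slot["event"].set()
                    return True
        return False

board = ConstraintBoard()
received = []
client = threading.Thread(
    target=lambda: received.append(
        board.wait_for("stock_offer", lambda t: t[2] < 90, timeout=5)))
client.start()
while not board.notify_about(("stock_offer", "aol", 89)):
    pass   # retry until the client thread has registered
client.join()
print(received)
```

The constraint runs inside notify_about, on the side that holds the data, so the waiting client never has to fetch a term, test it, and put it back.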

In a typical application (see Fig. 4), a Web based data extractor fetches stock
quotes and related historical information, which is visualized using our VRML
generator. In particular, before committing to a stock market transaction, a visual
animation of the projected stock market dynamics can help human users
to quickly validate the proposed buying or selling action. Trader agents, watching
for triggers expressed as blackboard constraints, run in the server process on
separate threads. They allow expressing complex conditions for buy or sell transactions,
far beyond the stop and limit transactions conventional online brokerages
can offer. The state of the server (Web data and user agents) is made persistent 
through a separate thread which periodically saves terms on the blackboard and 
thread suspension records on the server, to Prolog dynamic clause files. 

5 Related Work 

Dynamic generation of HTML Web pages with Prolog CGI scripts was
pioneered by the Madrid group’s Pillow library [3]. Currently most free and
commercial Prologs offer basic support for CGI programming. Among them the
BinNet Internet Toolkit [17] offers extensive server and client side tools, ranging
from a Prolog based Web server supporting SSI (Server Side Includes) written
in Prolog, to a programmable Internet Search Engine (spider). The first description
of a Prolog term to VRML mapping (of which the one used in this
paper is an extension) was presented in [15]. This seems to be the model
also followed (using different implementation techniques) by [14], which also
pioneers the use of VRML visualization in constraint logic programming. Dynamic
VRML manipulation using a Java + EAI + VRML plugin + Prolog
interpreter in Java architecture is part of the BinNet Jinni demos (available
online at http://www.binnetcorp.com/Jinni). This prototypes a shared virtual
world, allowing synchronized manipulation of multiple VRML objects by
human users or Jinni based Prolog agents. Global state kept on a server ensures
that all users see the same VRML landscape.

Fig. 4. Prolog HTTP Server With VRML Generator, Web Data Extraction and
Persistent State

An important difference between the approach described in this paper and
[14] is the ability to use assumptions to parameterize deep components of VRML
trees. For instance, in [14] a code transformer needs to be written to replace
cylinders by cones in the Prolog sources of a VRML page. We can achieve the
same effect (see subsection 3.3) by expanding a generic macro shape and then
passing the actual value (cone or cylinder) as an intuitionistic assumption. At
the time the macro is called, the assumption will cause either a
cylinder or a cone to be generated.

The approach described in this paper also differs from [14] as we use Assumption
Grammars, as outlined in [15]. Among the advantages of Assumption
Grammars over conventional DCGs:

- no preprocessor is needed, therefore VRML syntax is emulated simply by
  adding a few Prolog operator definitions

- defining a reentrant macro language can be achieved more easily than with
  repeated calls to the DCG term expander

- the use of assumptions and (scoped) implications allows injecting modifications
  of colors and various other attributes in the VRML code

On the downside, as only BinProlog [16] and Jinni [18] support Assumption 
Grammars at this time, portability of our generator is quite limited. 

6 Conclusion 

We have described the use of Prolog as a program generator for empowering 
special purpose languages without processing abilities like VRML, to support 
dynamic Web content. 

We have described two Web based architectures for generating dynamic
VRML, supporting persistent server side state and multi-user synchronization. Not
only can Prolog be used as a macro language for building compact template files,
but typical AI components like Prolog planners can be used for realistic avatar
movement in VRML worlds.

Our syntax-mapping based generator mechanism allows reuse of VRML pro- 
gramming skills, while the presence of Prolog as a powerful macro language 
provides an interesting synergy between Prolog’s planning, scheduling, Internet 
data extraction ability and VRML’s Internet ready animated 3D visualization. 

Abstract. We model any network configuration arising from the execution
of a security protocol as a soft constraint satisfaction problem
(SCSP). We formalise the protocol goal of confidentiality as a property 
of the solution for an SCSP, hence confidentiality always holds with a 
certain security level. The policy SCSP models the network configuration 
where all admissible protocol sessions have terminated successfully, and 
an imputable SCSP models a given network configuration. Comparing 
the solutions of these two problems elicits whether the given configu- 
ration hides a confidentiality attack. We can also compare attacks and 
decide which is the most significant. The approach is demonstrated on 
the asymmetric Needham-Schroeder protocol. 



1 Introduction 

Modern computer networks are insecure in the sense that the ongoing traffic 
can be intercepted and eavesdropped by a malicious attacker, called spy below. 
Agents trying to communicate over an insecure network execute suitable security 
protocols to take advantage of the protocol goals. A major goal is confidentiality, 
which holds of a message that remains undisclosed to the spy. Failure to achieve 
the claimed goals of a protocol [AN96, Low96, LR97] has motivated a number 
of approaches to reasoning formally on security protocols (e.g. [Low95, BR97, 
Pau98, Bel99]). 

Our original contribution to formal protocol analysis is an approach to mod- 
elling any network configuration arising from the execution of a protocol as a soft 
constraint satisfaction problem (SCSP), and to detecting confidentiality attacks 
mounted by the spy in the given configuration. Also, we can establish which is 
the more significant out of a pair of attacks. 

Recall that an SCSP may be viewed as a classical constraint satisfaction 
problem (CSP) [Mac92, Wal96] where each assignment of values to variables in
the constraints is associated with an element taken from a partially ordered set.
These elements can then be interpreted as levels of preference, or of certainty, etc. 
When modelling security protocols, the partially ordered set contains security 
levels. For example, the Kerberos protocol [BP98] relies on two separate sets of 
session keys: one contains the “authorisation keys”, and the other contains the
“service keys”. Each authorisation key is used to encrypt several service keys. In
consequence, should the spy get hold of an authorisation key, by mere decryption 
she would also discover all the associated service keys. In fact, different keys (and, 
in general, different messages) must be associated with different security levels, 
and there are confidentiality attacks of different significance. We formally develop 
this reasoning using the soft constraint framework, whereas, to our knowledge, 
confidentiality is a mere yes/no property in the existing literature. 

We demonstrate our approach on the asymmetric Needham-Schroeder protocol
[NS78]. This is one of the most discussed security protocols, for it hides
a subtle but realistic attack that was discovered nearly two decades after the
protocol publication [Low95]. We assume a basic familiarity with the concepts
of encryption and decryption [QUI77, RSA76]. Encryption will be indicated by
fat braces, so that $\{|m|\}_K$ will stand for the ciphertext obtained by encrypting
message $m$ under key $K$. Encryption is perfect when $m$ can be recovered from
$\{|m|\}_K$ if and only if $K^{-1}$ is available. In this definition, the keys $K$ and $K^{-1}$ are
interchangeable. A major feature is that $K$ cannot be obtained from $K^{-1}$ or vice
versa. Encryption is often not perfect when it is implemented, but Lowe’s attack
shows that designing a protocol that achieves its claimed goals is not trivial even
if perfect cryptography were available.

Below, we briefly review the basics of semiring-based SCSPs (§2), and then 
describe how to use them to model attacks to security protocols (§3). Then, we 
describe the asymmetric Needham-Schroeder protocol (§4), use our approach on 
that protocol (§5), and conclude (§6). 



2 Soft Constraints 



Several formalisations of the concept of soft constraints are currently available
[SFV95, DFP93, FW92, FL93]. In the following, we refer to one that is based
on semirings [BMR95, BMR97, Bis01], which can be shown to generalise and
express many of the others [BFM+96, BFM+99].

Let us first recall that a CSP is a tuple $\langle V, D, C, con, def, a \rangle$ where

- $V$ is a finite set of variables, i.e., $V = \{v_1, \ldots, v_n\}$;

- $D$ is a set of values, called the domain;

- $C$ is a finite set of constraints, i.e., $C = \{c_1, \ldots, c_m\}$. $C$ is ranked, i.e.
  $C = \bigcup_k C_k$, such that $c \in C_k$ if $c$ involves $k$ variables;

- $con$ is called the connection function and it is such that $con : \bigcup_k (C_k \to V^k)$,
  where $con(c) = \langle v_1, \ldots, v_k \rangle$ is the tuple of variables involved in $c \in C_k$;

- $def$ is called the definition function and it is such that $def : \bigcup_k (C_k \to \wp(D^k))$,
  where $\wp(D^k)$ is the powerset of $D^k$, that is, all the possible subsets
  of $k$-tuples in $D^k$;

- $a \subseteq V$, and represents the distinguished variables of the problem.

In words, function con describes which variables are involved in which constraint, 
while function def specifies which are the domain tuples permitted by the con- 
straint. The set a is used to point out the variables of interest in the given CSP, 
i.e., the variables for which we want to know the possible assignments, compat- 
ibly with all the constraints (note that other classical definitions of constraint 
problems do not have the notion of distinguished variables, and thus it is as if 
all variables are of interest). 

An example of a CSP is depicted in figure 1, where variables are inside circles and
constraints are represented by undirected arcs. Here we assume that the domain
$\mathcal{D}$ of the variables contains only the elements a and b.

[Figure omitted; the only labels recovered from it are the tuples <a, a> and <a, b>.]

Fig. 1. A CSP



To transform a classical constraint into a soft one, we need to associate to
each instantiation of its variables a value from a partially ordered set. Combining
constraints will then have to take into account such additional values, and thus
the formalism has also to provide suitable operations for combination ($\times$) and
comparison ($+$) of tuples of values and constraints. This is why this formalisation
is based on the concept of semiring, which is just a set plus two operations.

A semiring is a tuple $\langle A, +, \times, 0, 1 \rangle$ such that:

— $A$ is a set and $0, 1 \in A$;

— $+$ is commutative, associative and 0 is its unit element;

— $\times$ is associative, distributes over $+$, 1 is its unit element and 0 is its absorbing
element.

A c-semiring is a semiring $\langle A, +, \times, 0, 1 \rangle$ such that: $+$ is idempotent, 1 is its
absorbing element and $\times$ is commutative.

Let us consider the relation $\le_S$ over $A$ such that $a \le_S b$ iff $a + b = b$. Then
it is possible to prove that (see [BMR97]):

— $\le_S$ is a partial order;

— $+$ and $\times$ are monotone on $\le_S$;

— 0 is its minimum and 1 its maximum;

— $\langle A, \le_S \rangle$ is a complete lattice and, for all $a, b \in A$, $a + b = lub(a, b)$.

Informally, the relation $\le_S$ gives us a way to compare (some of the) tuples of
values and constraints. In fact, when we have $a \le_S b$, we will say that b is better
than a. Below, $\le$ will usually replace $\le_S$.

A constraint system is a tuple $CS = \langle S, \mathcal{D}, V \rangle$ where $S$ is a c-semiring, $\mathcal{D}$ is
a finite set (the domain of the variables) and $V$ is an ordered set of variables.

Given a semiring $S = \langle A, +, \times, 0, 1 \rangle$ and a constraint system $CS = \langle S, \mathcal{D}, V \rangle$,
a constraint is a pair $\langle def, con \rangle$ where $con \subseteq V$ and $def : \mathcal{D}^{|con|} \to A$. Therefore,
a constraint specifies a set of variables (the ones in con), and assigns to each
tuple of values of these variables an element of the semiring.

A soft constraint problem is a pair $\langle C, con \rangle$ where $con \subseteq V$ and $C$ is a set of
constraints: con is the set of variables of interest for the constraint set C, which
however may concern also variables not in con.

Figure 2 pictures a soft CSP, with the semiring values written to the right of 
the corresponding tuples, obtained from the classical one represented in figure 1 
by using the fuzzy c-semiring [DFP93, Rut94, Sch92]: 

$S_{FCSP} = \langle [0, 1], \max, \min, 0, 1 \rangle$.



Variable x:
  a --> 0.9

Constraint between x and y:
  <a, a> --> 0.8
  <a, b> --> 0.2
  <b, a> --> 0
  <b, b> --> 0

Variable y:
  a --> 0.9

Fig. 2. A fuzzy CSP



Combining and projecting soft constraints. Given two constraints $c_1 = \langle def_1, con_1 \rangle$
and $c_2 = \langle def_2, con_2 \rangle$, their combination $c_1 \otimes c_2$ is the constraint
$\langle def, con \rangle$ defined by $con = con_1 \cup con_2$ and
$def(t) = def_1(t \downarrow^{con}_{con_1}) \times def_2(t \downarrow^{con}_{con_2})$,
where $t \downarrow^X_Y$ denotes the tuple of values over the variables in $Y$, obtained by
projecting tuple $t$ from $X$ to $Y$. In words, combining two constraints means building
a new constraint involving all the variables of the original ones, and associating
to each tuple of domain values for such variables a semiring element that is
obtained by multiplying the elements associated by the original constraints to the
appropriate subtuples.

Given a constraint $c = \langle def, con \rangle$ and a subset $I$ of $V$, the projection of $c$
over $I$, written $c \Downarrow_I$, is the constraint $\langle def', con' \rangle$ where $con' = con \cap I$ and
$def'(t') = \sum_{t / t \downarrow^{con}_{I \cap con} = t'} def(t)$. Informally, projecting means eliminating some
variables. This is done by associating to each tuple over the remaining variables
a semiring element which is the sum of the elements associated by the original
constraint to all the extensions of this tuple over the eliminated variables.

In short, combination is performed via the multiplicative operation of the 
semiring, and projection via the additive operation. 
Solutions. The solution of an SCSP problem $P = \langle C, con \rangle$ is the constraint
$Sol(P) = (\bigotimes C) \Downarrow_{con}$. That is, we combine all constraints, and then project
over the variables in con. In this way we get the constraint over con which is
“induced” by the entire SCSP.

For example, each solution of the fuzzy CSP of figure 2 consists of a pair 
of domain values (that is, a domain value for each of the two variables) and 
an associated semiring element. Such an element is obtained by looking at the 
smallest value for all the subtuples (as many as the constraints) forming the 
pair. For example, for tuple <a, a> (that is, x = y = a), we have to compute the
minimum between 0.9 (which is the value for x = a), 0.8 (which is the value for
<x = a, y = a>) and 0.9 (which is the value for y = a). Hence, the resulting value
for this tuple is 0.8. 
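
The computation above can be sketched in a few lines of Python. This is a minimal illustration rather than an implementation from the literature: the representation of a constraint as a pair (variables, table), the function names `combine`, `project` and `solution`, and the values 0.1 and 0.5 assigned to element b in the two unary constraints are all assumptions made for the example (only the values for a are stated in the text).

```python
from itertools import product

DOMAIN = ("a", "b")

# Fuzzy c-semiring <[0, 1], max, min, 0, 1>.
plus = max    # additive operation, used by projection
times = min   # multiplicative operation, used by combination

# A constraint is a pair (con, defn): con is a tuple of variable names and
# defn maps every tuple of domain values over con to a semiring value.
c_x = (("x",), {("a",): 0.9, ("b",): 0.1})   # value for b is assumed
c_y = (("y",), {("a",): 0.9, ("b",): 0.5})   # value for b is assumed
c_xy = (("x", "y"), {("a", "a"): 0.8, ("a", "b"): 0.2,
                     ("b", "a"): 0.0, ("b", "b"): 0.0})


def combine(c1, c2):
    """Combination: multiply the values of the appropriate subtuples."""
    (con1, def1), (con2, def2) = c1, c2
    con = tuple(dict.fromkeys(con1 + con2))   # ordered union of variables
    defn = {}
    for values in product(DOMAIN, repeat=len(con)):
        t = dict(zip(con, values))
        defn[values] = times(def1[tuple(t[v] for v in con1)],
                             def2[tuple(t[v] for v in con2)])
    return con, defn


def project(c, keep):
    """Projection: sum the values of all extensions of each remaining tuple."""
    con, defn = c
    new_con = tuple(v for v in con if v in keep)
    new_def = {}
    for values, value in defn.items():
        t = dict(zip(con, values))
        key = tuple(t[v] for v in new_con)
        new_def[key] = plus(new_def.get(key, 0.0), value)
    return new_con, new_def


def solution(constraints, con):
    """Sol(P): combine all constraints, then project over con."""
    combined = constraints[0]
    for c in constraints[1:]:
        combined = combine(combined, c)
    return project(combined, con)


sol_con, sol_def = solution([c_x, c_xy, c_y], {"x", "y"})
x_con, x_def = project((sol_con, sol_def), {"x"})
```

Here `sol_def[("a", "a")]` reproduces the value computed in the text, while `x_def` shows what projecting the solution onto x alone would give under the assumed values.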



3 Using SCSPs for Protocol Analysis 

We explain here how to formalise any network configuration arising from the 
execution of a protocol as an SCSP, and define confidentiality and confidential- 
ity attacks as properties of the solution for that SCSP. The following, general 
treatment is demonstrated in §5. 



3.1 The Security Semiring 

We define the set L to contain unknown, private, debug and public. Each of
these elements represents a possible security level that a protocol associates to
messages. The security levels regulate the agents’ knowledge of messages. They
may be seen as an extension of the two-valued property of known/unknown. For
example, unknown will be assigned to those messages that a protocol does not 
encompass, and obviously to those messages that a given agent does not know; 
debug will be assigned to the messages that are created and then exchanged 
during a protocol session. The remaining levels are self-explanatory. 

Figure 3 defines a multiplicative operator, $\times_{sec}$, and an additive one, $+_{sec}$.
Theorem 1 introduces the security semiring.

Theorem 1 (Security Semiring).

$S_{sec} = \langle L, +_{sec}, \times_{sec}, public, unknown \rangle$ is a c-semiring.

Proof. There exists an isomorphism between the fuzzy c-semiring (§2) and $S_{sec}$:
the security levels can be mapped into the values in the range [0, 1] (unknown
being mapped into 1, public being mapped into 0), $+_{sec}$ can be mapped into
function max, and $\times_{sec}$ into function min.

Since $\times_{sec}$ is idempotent, it is also the glb operator in the total order of L.
While the current four levels will suffice to model most protocols, it is
understood that more complex protocols may require additional ones, such as notice,
warning, error, crit. The security semiring can be easily extended by upgrading
the definitions of $\times_{sec}$ and $+_{sec}$.







×sec      unknown   private   debug     public
unknown   unknown   private   debug     public
private   private   private   debug     public
debug     debug     debug     debug     public
public    public    public    public    public

+sec      unknown   private   debug     public
unknown   unknown   unknown   unknown   unknown
private   unknown   private   private   private
debug     unknown   private   debug     debug
public    unknown   private   debug     public

Fig. 3. The definitions of ×sec and +sec
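
The two operators can also be written down executably, which makes it possible to check the c-semiring axioms of Theorem 1 by exhaustive enumeration rather than via the isomorphism. The following Python sketch assumes the total order public < debug < private < unknown implied by the proof; the names `plus_sec`, `times_sec` and `is_c_semiring` are ours.

```python
from itertools import product

# Security levels, listed from the semiring zero (public) to the one (unknown).
LEVELS = ("public", "debug", "private", "unknown")
RANK = {level: i for i, level in enumerate(LEVELS)}
ZERO, ONE = "public", "unknown"


def plus_sec(l1, l2):
    """Additive operator: the higher of the two levels (max)."""
    return l1 if RANK[l1] >= RANK[l2] else l2


def times_sec(l1, l2):
    """Multiplicative operator: the lower of the two levels (min)."""
    return l1 if RANK[l1] <= RANK[l2] else l2


def is_c_semiring():
    """Check the c-semiring axioms exhaustively over the four levels."""
    for a, b, c in product(LEVELS, repeat=3):
        ok = (
            plus_sec(a, b) == plus_sec(b, a)                      # + commutative
            and times_sec(a, b) == times_sec(b, a)                # x commutative
            and plus_sec(a, plus_sec(b, c)) == plus_sec(plus_sec(a, b), c)
            and times_sec(a, times_sec(b, c)) == times_sec(times_sec(a, b), c)
            and times_sec(a, plus_sec(b, c))
            == plus_sec(times_sec(a, b), times_sec(a, c))         # distributivity
            and plus_sec(a, ZERO) == a and times_sec(a, ONE) == a  # units
            and times_sec(a, ZERO) == ZERO                        # 0 absorbing for x
            and plus_sec(a, ONE) == ONE                           # 1 absorbing for +
            and plus_sec(a, a) == a                               # + idempotent
        )
        if not ok:
            return False
    return True
```

Adding further levels, as suggested above, would amount to extending `LEVELS` in this sketch, provided the new levels fit into the same total order.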



3.2 The Network Constraint System 

A computer network can be modelled as a constraint problem over a constraint
system $CS_n = \langle S_{sec}, \mathcal{D}, V \rangle$ defined as follows:

— $S_{sec}$ is the security semiring (§3.1);

— $V$ is an unlimited set of variables, including SPY, each representing a network
agent;

— $\mathcal{D}$ is an unlimited set of values including the empty message {||}, all atomic
messages, as well as all messages recursively obtained by concatenation and
encryption. Intuitively, $\mathcal{D}$ represents all agents’ possible knowledge.

We consider an unlimited number of atomic messages, which typically are agent 
names, timestamps, nonces and cryptographic keys. Concatenation and encryp- 
tion operations can be applied an unlimited number of times. Also, each agent 
can initiate an unlimited number of protocol sessions with any other agent. 

We call $CS_n$ the network constraint system. Note that $CS_n$ does not depend
on any protocols, for it merely portrays the topology of a computer network on
which any protocol can be implemented. Members of $V$ will be indicated by
capital letters, while members of $\mathcal{D}$ will be in small letters.

3.3 The Initial SCSP 

Each security protocol V is associated with a policy that should, at least, state 
which messages are public, and which messages are private for which agents. 

It is intuitive to capture these policy rules by means of our security levels
(§3.1). Precisely, these rules can be translated into unary constraints. For each
agent $A \in V$, we define a unary constraint that states the security levels of A’s
knowledge as follows. It associates security level public to all agent names and to
timestamps (if V uses them); level private to A’s initial secrets,¹ such as keys (i.e.
Ka if V uses symmetric encryption, or Ka^{-1} if it uses asymmetric encryption)
or nonces; level unknown to all remaining domain values (including, e.g., other
agents’ initial secrets, or A’s secrets created during the protocol execution).

¹ As opposed to the secrets created during the protocol execution.

This procedure defines what we name initial SCSP for V, which portrays a 
network where V can be executed but none of its sessions has started yet. The 
initial SCSP for V also highlights the agents’ initial knowledge, which typically 
consists of atomic messages such as agent names, timestamps, some keys and 
some nonces. 
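
As a minimal sketch of this procedure, the following Python fragment builds the unary constraints for two agents. The string encoding of messages ("a", "Ka", "Ka^-1"), the function names, and the decision to mark public keys as public in the asymmetric case (as figure 6 does) are assumptions of the example; timestamps and initial nonces are left out for brevity. Any message absent from a table is read as unknown.

```python
def initial_constraint(agent, agents, asymmetric=True):
    """Unary constraint on `agent`: a map from messages to security levels."""
    levels = {}
    for other in agents:
        name = other.lower()
        levels[name] = "public"              # agent names are public
        if asymmetric:
            levels["K" + name] = "public"    # public keys, as in Fig. 6
    own = agent.lower()
    secret = "K" + own + "^-1" if asymmetric else "K" + own
    levels[secret] = "private"               # the agent's initial secret key
    return levels


def level(constraint, message):
    """Security level of a message; all remaining domain values are unknown."""
    return constraint.get(message, "unknown")


AGENTS = ("A", "B")
initial_scsp = {agent: initial_constraint(agent, AGENTS) for agent in AGENTS}
```

With these definitions, `level(initial_scsp["A"], "Kb^-1")` yields unknown, reflecting that A does not know B's initial secret.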

Considerations on how official protocol specifications often fail to provide a
satisfactory policy [BMPT00] exceed the scope of this paper. Nevertheless,
having to define the initial SCSP for a protocol may pinpoint unknown deficiencies
or ambiguities in the policy.



3.4 The Policy SCSP 

The policy for a protocol V also establishes which messages must be exchanged 
during a session between a given pair of agents. 

We extend the initial SCSP for V (§3.3) with further constraints. Each step 
of a session of V between any pair of agents can be translated into, at most, two 
constraints. Precisely, for each protocol step whereby A sends a message m to 
B, the following rules must be followed. 

$R_1$) If A invents a new secret n (i.e. typically a new nonce) and uses it to build
m, then add a unary constraint on A that assigns security level debug to n,
and level unknown to all remaining messages.

$R_2$) Add a binary constraint between A and B that assigns security level debug
to the tuple <{||}, m>, and level unknown to all other possible tuples.

The unary constraint advanced by $R_1$ corresponds to A’s off-line creation of
m, while the binary constraint stated by $R_2$ corresponds to A’s sending m to
B, and B's receiving m. The two rules yield what we name policy SCSP for V.
This SCSP formalises a network where each agent has successfully terminated 
an unlimited number of protocol sessions with every other agent, while the spy 
has performed no malicious activity. 
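
The translation of protocol steps into constraints can be sketched as follows. The encoding of a step as a tuple (sender, receiver, message, fresh secret), the string form of messages and the name `policy_constraints` are assumptions of this example; the three steps listed are those of the Needham-Schroeder protocol presented in §4, for a single session between A and B. Tuples absent from a table stand for level unknown.

```python
EMPTY = "{||}"   # the empty message


def policy_constraints(steps):
    """Translate protocol steps into constraints, following rules R1 and R2.

    Each constraint is a pair (variables, table); the table lists only the
    tuples at level debug, all other tuples being implicitly unknown.
    """
    constraints = []
    for sender, receiver, message, fresh in steps:
        if fresh is not None:
            # R1: unary constraint on the sender for the newly invented secret.
            constraints.append(((sender,), {(fresh,): "debug"}))
        # R2: binary constraint between sender and receiver for the message.
        constraints.append(((sender, receiver), {(EMPTY, message): "debug"}))
    return constraints


# One session of the protocol of section 4 (hypothetical string encoding).
NS_STEPS = [
    ("A", "B", "{|Na, a|}_Kb", "Na"),
    ("B", "A", "{|Na, Nb|}_Ka", "Nb"),
    ("A", "B", "{|Nb|}_Kb", None),
]

policy = policy_constraints(NS_STEPS)
```

Each step contributes at most two constraints, so the three steps above yield five constraints in total.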



3.5 The Imputable SCSP 

A finite network history induced by a protocol V may be viewed as a repeated 
sequence of three steps in various order: agents’ creating, sending and receiving 
messages. However, in the real world, not all messages that are sent are then 
received by the intended recipient or received at all. This is due to the malicious 
activity of the spy. Hence, to model the configuration of the network at a certain 
point in a possible history as an SCSP, we need a variation of rule $R_2$ (§3.4) in
order to allow for the malicious activity of the spy. So, for each protocol step
whereby A sends a message m to B, we constrain the initial SCSP for V (§3.3)
additionally, as stated by rule $R_1$ and by the following variation of rule $R_2$.

$R'_2$) If C receives m, then add a binary constraint between A and C that assigns
security level debug to the tuple <{||}, m>, and level unknown to all other
possible tuples.

Note that C could either be B (in case the spy has not interfered) or be the spy
(no other agent can act maliciously). In particular, should the spy, in the latter 
case, also deliver m to B, the rule quoted above would apply to the triple SPY, 
m and B. Or, should the spy tamper with m obtaining m', and then deliver it 
to another agent D, then the rule would apply to the triple SPY, m' , and D. 

We call the resulting soft constraint problem an imputable SCSP for the
network of agents running protocol V. There exist an unlimited number of
imputable SCSPs for V, which, in particular, include both the initial SCSP for V
and the policy SCSP for V.

3.6 Agents’ Knowledge as a Security Entailment 

The agents’ knowledge increases dynamically while they are running a protocol.
This can be represented using a language like cc [Sar89] suitably extended
to deal with soft constraints [Bis01]. The extension relies on the notion of
α-consistency, rather than on the notion of consistency/inconsistency. We extend
the entailment relation “$\vdash$” [Sco82], which captures from the store the constraints
implied but not explicitly present, with four new rules. These model the
operations (message encryption, decryption, concatenation and splitting) that
the agents may perform on the messages that they see, hence increasing their
knowledge. We name the obtained relation security entailment. Below, function
def is associated to a generic constraint over a generic agent A.

Encryption

$def(m_1) = v_1$, $def(m_2) = v_2$, $def(\{|m_1|\}_{m_2}) = v$
$\vdash$ $def(\{|m_1|\}_{m_2}) = (v \times_{sec} (v_1 +_{sec} v_2))$

Decryption

$def(m_1) = v_1$, $def(m_2) = v_2$, $def(\{|m_1|\}_{m_2}) = v$
$\vdash$ $def(m_1) = (v_1 \times_{sec} v)$, $def(m_2) = (v_2 \times_{sec} v)$

Concatenation

$def(m_1) = v_1$, $def(m_2) = v_2$, $def(\{|m_1, m_2|\}) = v$
$\vdash$ $def(\{|m_1, m_2|\}) = (v \times_{sec} (v_1 +_{sec} v_2))$

Splitting

$def(m_1) = v_1$, $def(m_2) = v_2$, $def(\{|m_1, m_2|\}) = v$
$\vdash$ $def(m_1) = (v_1 \times_{sec} v)$, $def(m_2) = (v_2 \times_{sec} v)$
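
The four rules share two shapes: encryption and concatenation update the level of the compound message, while decryption and splitting update the levels of its two components. A minimal Python sketch under that reading follows; the tuple encoding ("enc", m1, m2) and ("cat", m1, m2) of compound messages, the dictionary used as constraint store, and the names `compose` and `decompose` are assumptions of the example, and the rules are applied exactly as printed above.

```python
RANK = {"public": 0, "debug": 1, "private": 2, "unknown": 3}


def plus_sec(l1, l2):
    """Additive operator of the security semiring (max)."""
    return max(l1, l2, key=RANK.get)


def times_sec(l1, l2):
    """Multiplicative operator of the security semiring (min)."""
    return min(l1, l2, key=RANK.get)


def level(store, message):
    """Level of a message in an agent's store; unlisted messages are unknown."""
    return store.get(message, "unknown")


def compose(store, kind, m1, m2):
    """Encryption (kind='enc') or concatenation (kind='cat') rule."""
    compound = (kind, m1, m2)
    v, v1, v2 = level(store, compound), level(store, m1), level(store, m2)
    store[compound] = times_sec(v, plus_sec(v1, v2))


def decompose(store, kind, m1, m2):
    """Decryption (kind='enc') or splitting (kind='cat') rule."""
    v, v1, v2 = level(store, (kind, m1, m2)), level(store, m1), level(store, m2)
    store[m1] = times_sec(v1, v)
    store[m2] = times_sec(v2, v)


# The spy holds {|Nb|}_Kc at level debug and applies the decryption rule.
spy = {"Kc": "public", ("enc", "Nb", "Kc"): "debug"}
decompose(spy, "enc", "Nb", "Kc")

# An agent holding Na (debug) and Kb (public) applies the encryption rule.
agent = {"Na": "debug", "Kb": "public"}
compose(agent, "enc", "Na", "Kb")
```

After these calls the spy's level for Nb has dropped from unknown to debug, and the agent holds the ciphertext at level debug.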

3.7 Formalising Confidentiality 

In this section, $l$ will denote a security level, and m a message. Moreover, given
a security protocol, P will indicate the policy SCSP for it, and p and p' some
imputable SCSPs for the same protocol. We define $Sol(P) \Downarrow_{\{SPY\}} = \langle con_P, def_P \rangle$,
$Sol(p) \Downarrow_{\{SPY\}} = \langle con, def \rangle$, and $Sol(p') \Downarrow_{\{SPY\}} = \langle con', def' \rangle$.
Definition 1 ($l$-Confidentiality).

$l$-confidentiality of m in p $\iff$ $def(m) > l$.

If $l$-confidentiality of m in p holds, then we say that m is $l$-confidential in p.
Intuitively, this signifies that the spy does not know message m in the network
configuration given by p with security level $l$ but, rather, with a level that is
better than $l$.

The definition becomes spurious if def(m) = public, in which case we say that
m has worst confidentiality in p, or if $l$ = unknown. Conversely, if def(m) =
unknown, we say that m has best confidentiality in p, as $l$-confidentiality holds
for all $l$ < unknown. If $l$ = public, then m is certainly $l$-confidential in p unless
def(m) = public.

Definition 2 (Confidentiality Attack).

Confidentiality attack on m in p $\iff$ $def(m) < def_P(m)$.

If confidentiality attack on m in p holds, then we say that there is a confidentiality
attack on m in p. Intuitively, this signifies that the spy has lowered in p her
security level for m w.r.t. that allowed by the protocol policy. Clearly, if there
is a confidentiality attack on m in p such that def(m) = $l$, then m is not
$l$-confidential in p.

Confidentiality attacks can be compared as follows. If there is a confidentiality
attack on m in p such that def(m) = $l$, and a confidentiality attack on m in p'
such that def'(m) = $l'$ and $l$ < $l'$, then we say that p hides a worse confidentiality
attack on m than p' does.
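
Both definitions reduce to comparisons in the total order of security levels, as the following Python sketch illustrates. The dictionaries stand for the functions def and def_P obtained by projecting the solutions on the spy; the function names and the message name "Nb_a" are assumptions of the example, with the levels for Nb_a taken from the analysis in §5.

```python
RANK = {"public": 0, "debug": 1, "private": 2, "unknown": 3}


def level(defn, m):
    """Spy's level for m in a projected solution; unlisted means unknown."""
    return defn.get(m, "unknown")


def is_confidential(defn, m, l):
    """Definition 1: m is l-confidential iff def(m) > l."""
    return RANK[level(defn, m)] > RANK[l]


def is_attack(defn, def_policy, m):
    """Definition 2: confidentiality attack on m iff def(m) < def_P(m)."""
    return RANK[level(defn, m)] < RANK[level(def_policy, m)]


def hides_worse_attack(defn, defn_prime, def_policy, m):
    """p hides a worse attack on m than p' if both are attacks and l < l'."""
    return (is_attack(defn, def_policy, m)
            and is_attack(defn_prime, def_policy, m)
            and RANK[level(defn, m)] < RANK[level(defn_prime, m)])


policy_spy = {}                    # the policy leaves Nb_a unknown to the spy
lowe_spy = {"Nb_a": "debug"}       # spy's level in the configuration of §5
```

For instance, `is_attack(lowe_spy, policy_spy, "Nb_a")` holds, and Nb_a is public-confidential but not debug-confidential in that configuration.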

4 The Needham-Schroeder Protocol 

We present the “asymmetric” protocol due to Needham and Schroeder, which
is based on asymmetric cryptography (e.g. RSA [RSA76]) rather than on
symmetric cryptography (e.g. DES [QUI77]). Each agent A is endowed with a public
key Ka, which is known to all, and a private key Ka^{-1}, which should be known
only to A. Note that limiting the knowledge of Ka^{-1} only to A is an assumption
explicitly required by the protocol, rather than a property that is enforced.

Recall that a nonce is a “number that is used only once” [NS78]. The protocol 
assumes that agents can invent truly-random nonces, so that, given a nonce N 
invented by an agent P, the probability that agents other than P guess N is 
negligible. 

The first step sees an initiator A initiate the protocol with a responder B. A
invents a nonce Na and encrypts it along with her identity under B’s public key.
Upon reception of that message, B decrypts it and extracts A’s nonce. Then, he
invents a nonce Nb and encrypts it along with Na under A’s public key. When
A receives message 2, she extracts Nb and sends it back to B, encrypted under
his public key.

1. $A \to B : \{|N_a, A|\}_{K_b}$

2. $B \to A : \{|N_a, N_b|\}_{K_a}$

3. $A \to B : \{|N_b|\}_{K_b}$

Fig. 4. The asymmetric Needham-Schroeder protocol

The goal of the protocol is authentication: at completion of a protocol session
initiated by A with B, A should get evidence of having communicated with B and,
likewise, B should get evidence of having communicated with A. We emphasise
that authentication here is achieved by means of confidentiality of the nonces.
Indeed, upon reception of Na inside message 2, A would conclude that she is
interacting with B, the only agent who could retrieve Na from message 1, since
Na is a truly-random nonce and encryption is perfect. In the same fashion, when
B receives Nb inside message 3, he would conclude that A was at the other end
of the network because Nb must have been obtained from message 2, and no-one
but A could perform this operation. But, what happens if the spy intercepts
some messages?

4.1 Lowe’s Attack on the Needham-Schroeder Protocol

Recall that security protocols are implemented as distributed concurrent pro- 
grams. Lowe discovers [Low95] that the Needham-Schroeder protocol allows the 
scenario depicted in figure 5, whereby a malicious agent C can interleave two of 
the protocol sessions, provided that some agent A initiates one session with C. 



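1. $A \to C : \{|N_a, A|\}_{K_c}$

1'. $C \to B : \{|N_a, A|\}_{K_b}$

2'. $B \to A : \{|N_a, N_b|\}_{K_a}$

2. $C \to A : \{|N_a, N_b|\}_{K_a}$

3. $A \to C : \{|N_b|\}_{K_c}$

3'. $C \to B : \{|N_b|\}_{K_b}$

Fig. 5. Lowe’s attack on the Needham-Schroeder Protocol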



Note that C could be a registered user of the network, so no-one could suspect
his tampering. Since A initiates with C, she encrypts her nonce and her identity
under C’s public key. Having obtained these data, C initiates another session
(indicated by the primes) with another agent B, quoting A’s data rather than
his own. From this message, B deduces that A is trying to communicate with him.
Therefore, B replies to A, quoting her nonce and his own, Nb. Since the entire
network is under C’s control, C intercepts this message before it is delivered to
A but cannot decrypt it because encryption is perfect. So, C forwards it to A.
The message is of the form that A was expecting, hence A extracts Nb and sends
it to the agent with whom she had initiated the first session, C. This compromises
the confidentiality of Nb, so C can use it to complete the session with B by issuing
message 3', which is of the form that B was expecting.

As a result, B believes he has communicated with A, while A was in fact
communicating with C. In other words, C impersonates A with B: the protocol
has ultimately failed to achieve authentication because it has failed to keep Nb
confidential. The consequence of the attack is that B will consider every future
message quoting Nb as coming from A. If B is a bank and A and C are account
holders, C could ask B to transfer money from A’s account to C’s without A
realising it.

Lowe proposes to quote B's identity in message 2, so that A would address 
message 3 to the agent mentioned in that message rather than to the one with 
whom she initiated the session. This upgrade would prevent the attack in figure 5 
because message 3 would be encrypted under B’s public key, so C would not 
discover Nb and could not construct message 3' to complete the session with B. 
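
The interleaving of figure 5, and the effect of the amendment, can be replayed symbolically. The Python sketch below is an illustration under strong simplifying assumptions: encryption is modelled as a record that only the owner of the key can open, agents are plain strings, and the function `lowe_scenario` and its `amended` flag are names introduced for this example. With the amendment, B's identity is quoted in message 2 and A addresses message 3 to the agent named there.

```python
def encrypt(payload, owner):
    """Ciphertext under `owner`'s public key (perfect encryption assumed)."""
    return {"payload": tuple(payload), "owner": owner}


def decrypt(ciphertext, agent):
    """Return the payload if `agent` holds the matching private key, else None."""
    return ciphertext["payload"] if ciphertext["owner"] == agent else None


def lowe_scenario(amended):
    """Replay the two interleaved sessions; return what the spy C learns."""
    spy_knows = set()

    msg1 = encrypt(("Na", "A"), "C")              # 1.  A -> C
    spy_knows.update(decrypt(msg1, "C"))

    msg1p = encrypt(("Na", "A"), "B")             # 1'. C -> B, quoting A's data
    nonce, initiator = decrypt(msg1p, "B")

    reply = (nonce, "Nb", "B") if amended else (nonce, "Nb")
    msg2 = encrypt(reply, initiator)              # 2'. B -> A, intercepted by C
    assert decrypt(msg2, "C") is None             # C cannot decrypt it

    contents = decrypt(msg2, "A")                 # 2.  C forwards it to A
    recipient = contents[2] if amended else "C"   # whom A addresses message 3 to
    msg3 = encrypt((contents[1],), recipient)     # 3.  A sends Nb back

    seen = decrypt(msg3, "C")
    if seen is not None:
        spy_knows.update(seen)
    return spy_knows
```

Calling `lowe_scenario(False)` returns a set containing Nb, whereas `lowe_scenario(True)` does not, in line with the argument above.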

5 Modelling Needham-Schroeder 

We demonstrate our approach to protocol analysis on the Needham-Schroeder 
protocol. Note that all SCSPs presented below are in fact suitable fragments. As 
a start, we build the initial SCSP (figure 6) and the policy SCSP for the protocol 
(omitted here). 



Unary constraint on A:
  a --> public
  b --> public
  Ka --> public
  Kb --> public
  Ka^{-1} --> private

Unary constraint on B:
  a --> public
  b --> public
  Ka --> public
  Kb --> public
  Kb^{-1} --> private

Fig. 6. Initial SCSP for the Needham-Schroeder protocol



Then, we build the imputable SCSP corresponding to the network configura- 
tion in which a protocol session between A and B has initiated and terminated 
without the spy’s interference (figure 7). We observe that our definition of con- 
fidentiality attack does not hold in this imputable SCSP. This confirms that, if 
the spy does not interfere with the completion of a session between A and B, 
then she does not acquire more knowledge than that allowed by the policy. In 
particular, the nonce Nb has best confidentiality in this SCSP. 

Unary constraint on A:
  a --> public
  b --> public
  Ka --> public
  Kb --> public
  Ka^{-1} --> private

Unary constraint on B:
  a --> public
  b --> public
  Ka --> public
  Kb --> public
  Kb^{-1} --> private

Binary constraints (recoverable tuples):
  <{||}, {Na, a}_{Kb}> --> debug
  <{||}, {Nb}_{Kb}> --> debug

Fig. 7. Imputable SCSP for completion of protocol session between A and B

Hence, we build the imputable SCSP corresponding to Lowe’s attack (figure 8).
We add a suitable suffix to B’s nonces to distinguish which agent they
are meant to be used with. There is a confidentiality attack on nonce Nba: by
the security entailment, the spy has lowered her security level for Nba from
unknown, which was stated by the policy SCSP, to debug. Indeed, Nba is not
even debug-confidential in this SCSP, but only public-confidential.



Unary constraint on A:
  a --> public
  b --> public
  c --> public
  Ka --> public
  Kb --> public
  Kc --> public
  Ka^{-1} --> private

Unary constraint on B:
  a --> public
  b --> public
  c --> public
  Ka --> public
  Kb --> public
  Kc --> public
  Kb^{-1} --> private

Unary constraint on C (recoverable entries):
  c --> public
  Ka --> public
  Kb --> public
  Kc --> public
  Kc^{-1} --> private
  Nb_a --> debug

Fig. 8. Imputable SCSP corresponding to Lowe’s attack



It is easy to verify that, if the protocol is amended as required by Lowe (§4.1), 
then the imputable SCSP corresponding to Lowe’s attack is not generated by 
our procedure for building the imputable SCSPs (§3.5). 
6 Conclusions and Future Work

A number of approaches for reasoning formally about security protocols are 
available (e.g. [Low95, Pau98, Bel99, BR97]). We have developed a new approach
based on soft constraints where confidentiality is not merely associated to a
boolean value but to a discrete security level. This allows a finer reasoning on
the confidentiality goals that one expects from a protocol. For example, let us
consider the network configuration in which a message that is meant for a pair of
agents becomes unexpectedly known to a third agent (i.e. debug). Some protocol
policy might address this as a confidentiality attack, while some other might not 
unless the message becomes known to all (i.e. public). Yet another policy might 
consider as a confidentiality attack the mere fact that the spy has some private 
information of her own. We formally treat those different levels of confidentiality, 
and argue that this is novel to the field of protocol verification. 

Our approach is not currently devoted to proving protocols attack-proof but, 
rather, to deciding whether a single configuration induced by a protocol is so. 
If the configuration is found to hide an attack, then our analysis also states its 
significance. Therefore, our approach might be used to complement the existing 
semi-automated ones based on model checking or theorem proving. Should these 
discover some crucial configuration, building the corresponding imputable SCSP 
would allow a deeper reasoning on that configuration by comparison with the 
policy SCSP. We expect this reasoning to be useful to the resolution of legal 
disputes where a configuration induced by a protocol is imputed as possibly 
illegal, and the judge must establish whether that configuration is admissible by 
the policy that comes with the protocol. 

Our protocol analyses shall be mechanised. We intend to implement a tool 
that takes as input a protocol specification, a protocol policy, and a configuration 
to study. The tool should build the policy SCSP for the protocol, the imputable 
SCSP for the given configuration, compare their solutions, and finally output 
“OK” or a statement of the confidentiality attacks mounted in that configuration 
with what security levels. Although computing the solution of an SCSP is in 
general NP-complete, computing those two solutions will be efficient in practice, 
once a reasonable bound is stated on the number of agents and on the number 
of protocol sessions that each agent is entitled to initiate. 

Another direction of research is developing a dynamic evolution of the initial
SCSP by means of agents who tell [Sar89] new constraints, so as to model the
evolution of the network. All network configurations could then be checked against
the policy but, to do so, the solutions of all possible imputable SCSPs would
have to be computed. This might raise the computational costs. However, since each
imputable SCSP differs from the previous one by at most two constraints (§3.5),
previous computation could be reused and complexity could be managed.

Future research also includes formalising additional protocol goals such as 
agent authentication, message authenticity and session key distribution. Our 
findings, together with the research directions sketched above, support soft con- 
straint programming as a significant and highly promising approach to protocol 
verification. 
Abstract. We show how deductive databases may be protected against
unauthorized retrieval and update requests issued by authenticated users. 
To achieve this protection, a deductive database is expressed in an equiv- 
alent form that is guaranteed to permit only authorized actions. When 
a user poses a query Q on the protected form of a database, the user 
sees the subset of the answers for Q that they are permitted to know are 
true in the database; when a user’s update request is received, a minimal 
set of authorized changes the user is permitted to make to the database 
is performed. The authorized retrieval and update requests are specified 
using a security theory that is expressed in normal clause logic. The ap- 
proach has a number of attractive technical results associated with it, 
and can be used to protect the information in any deductive database 
that is expressed in normal clause logic. 



1 Introduction 

Deductive databases are predicted to play an increasingly important role in 
the future [17] (in their own right, as DOOD hybrids and for web applications 
[7]). With the development of XSB [20], we believe that, for the first time, 
a potentially significant candidate deductive database system exists that can 
reasonably be expected to make an impact in practice. However, in order for 
deductive databases to make an impact, it is important to consider practical 
issues such as ensuring the security of the information they contain. 

To protect a deductive database from unauthorized retrieval or update re- 
quests, our approach involves expressing the database (or the subset of it that 
needs protecting) in a form that ensures that these access requests are autho- 
rized only if a security theory that is associated with the database permits the 
access. Henceforth, we will refer to the secure form of a deductive database as a 
protected database. For each user, a protected database, together with its associ- 
ated security theory, defines the subset of the logical consequences that the user 
is permitted to know to hold in the database and defines the sets of authorized 
changes that the user is permitted to make to the database in order to satisfy a 
change request. 

Since the approach we use to protect a deductive database from unauthorized 
retrieval requests is a simpler variant of the approach we use for protecting a 
database from update requests, in the bulk of the ensuing discussion we will con- 
centrate on the latter and consider how a deductive database may be protected 
against unauthorized insert, delete and modification requests. 
We assume that a role-based access control (RBAC) security policy [19] is
to be used to protect a deductive database. RBAC has a number of well
documented attractions [13] and supersedes the hitherto dominant discretionary
access control (DAC) and mandatory access control (MAC) policies that are
now recognized to be special cases of RBAC [3]. The RBAC policy we use to
protect a deductive database is based on the $RBAC_1$ model from [19].

Despite its importance, the security of deductive databases has thus far re- 
ceived scant attention in the literature. In [6] and [11], modal logics are con- 
sidered for specifying confidentiality requirements only. As such, these proposals 
are limited in scope, and the languages they use and approaches they suggest for 
protecting deductive databases are not especially compatible with the methods 
of representation and computation that deductive databases typically employ. 
In contrast, we use normal clause logic [16] to define the protection on deductive 
databases precisely because this language enables a declarative specification of 
security policies to be formulated, and seamlessly incorporated into the stan- 
dard representation of a deductive database as a function-free normal theory. 
Moreover, methods of computation that are ordinarily used for query evaluation 
and constraint checking on deductive databases may be used to ensure that the 
information they contain is protected from unauthorized access requests. 

In other related work, clause form logic is used in [14] to specify a range of 
high-level access control policies for protecting computer-based “systems” . Our 
approach to the representation of security policies is similar to [14], but relates 
to deductive databases specifically. As we will see, in addition to enabling access 
requests to be evaluated on deductive databases the approach we describe also 
generates minimal sets of authorized updates that are guaranteed to satisfy a 
user’s change request. In [15], a query language is described that enables a user 
of a multilevel secure (MLS) system to reason about the beliefs held by users 
with different security clearances. In our approach RBAC policies rather than 
MLS policies are assumed to be used to protect a database, theorem-proving 
techniques are used to retrieve the information that each individual user is au- 
thorized to see (rather than being used to reason about the beliefs of others), 
and updates are considered as well as retrievals. 

The rest of this paper is organized in the following way. In Section 2, some 
basic notions in security, RBAC and deductive databases are presented. In
Section 3, we show how an $RBAC_1$ model may be represented as a normal clause
theory. In Section 4, we consider how to represent a deductive database in an
equivalent form that guarantees that the database is free from unauthorized ac- 
cess requests. In Section 5, computational issues are discussed, and examples of 
the use of the approach and the technical results that apply to it are given. In 
Section 6, we describe a number of practical matters relating to our approach. 
Finally, in Section 7, some conclusions are drawn and suggestions for further 
work are made. 
2 Preliminaries

A deductive database, D, consists of a finite set of ground atomic assertions 
(i.e. facts) and a finite set of deductive rules. The set of facts is referred to 
as the extensional database (EDB) and the deductive rules are referred to as 
the intensional database (IDB). In the ensuing discussion we will use EDB(D) 
(IDB(D)) to denote the extensional (intensional) part of the database D. 

To protect D from unauthorized access requests, our approach is to express D 
in an equivalent form, D*, that ensures that access requests are only possible if 
the RBAC1 security theory (described below) that is associated with D* specifies 
that the access is authorized. Here and henceforth, D* is used to denote an 
arbitrary protected database and ΔD* will be used to denote an updated state 
of D*. To denote specific instances of D* (ΔD*), we use Di* (ΔDi*) where i 
is a natural number. We also use S* to denote an arbitrary RBAC1 security 
theory and Si* to denote a specific instance of an RBAC1 theory where i is a 
natural number. As we will see, theorem-proving techniques are used on D* ∪ 
S* to determine whether or not a user’s request to perform an operation on D* 
is authorized by S*. 

The protected deductive databases and the RBAC1 security theories that 
we consider consist of a finite set of function-free normal clauses [16]. A normal 
clause takes the form: H ← L1,L2,...,Lm. The head of the clause, H, is an atom 
and L1,L2,...,Lm is a conjunction of literals that constitutes the body of the 
clause. The conjunction of literals L1,L2,...,Lm must be true (proved) in order 
for H to be true (proved). A literal is an atomic formula or its negation; negation 
in this context is negation as failure [10], and the negation of the atom A is 
denoted by not A. A clause with an empty body is an assertion or a fact. A 
definite clause is a normal clause that has no negative literals in its body; a 
definite database (definite theory) is a finite set of definite clauses. 
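
The definitions above can be made concrete with a small data structure. The following is a minimal Python sketch, not part of the approach itself: the class names Literal and Clause and the helper is_definite_theory are hypothetical, and atoms are kept as plain strings for brevity.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Literal:
    atom: str              # e.g. "q(X)"
    positive: bool = True  # False encodes "not atom" (negation as failure)


@dataclass(frozen=True)
class Clause:
    head: str                        # the atom H
    body: Tuple[Literal, ...] = ()   # the conjunction L1,...,Lm

    def is_fact(self) -> bool:
        # A clause with an empty body is an assertion (fact).
        return len(self.body) == 0

    def is_definite(self) -> bool:
        # A definite clause has no negative literals in its body.
        return all(lit.positive for lit in self.body)


def is_definite_theory(clauses: Iterable[Clause]) -> bool:
    # A definite theory is a finite set of definite clauses.
    return all(c.is_definite() for c in clauses)


# s(X) <- not q(X), r(X) is normal but not definite; r(b) <- is a fact.
s_rule = Clause("s(X)", (Literal("q(X)", positive=False), Literal("r(X)")))
r_fact = Clause("r(b)")
```

Here s_rule is a normal clause that is not definite, whereas a theory containing only r_fact is definite.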

Since we consider function-free theories, the only terms that appear in D* 
∪ S* are constants and variables. Henceforth, we will denote the constants that 
appear in D* ∪ S* by lowercase symbols in literals. The variables 
that appear in the literals in D* ∪ S* will be denoted by uppercase 
symbols (possibly subscripted). 

Since D* ∪ S* is expressed in normal clause logic, it follows that the well-founded 
semantics [21] may be used for the associated declarative semantics, and 
that SLG-resolution [9] may be used for the corresponding procedural semantics. 

When SLG-resolution is used with the normal clause theory Di* ∪ Si*, a 
search forest is constructed starting from the SLG-tree with its root node labeled 
by the goal clause Q ← Q [9]. From the soundness and (search space) 
completeness of SLG-resolution (for flounder-free computations), Q is true in 
WFM(Di* ∪ Si*) iff there is an SLG-derivation for Q ← Q on Di* ∪ Si* that 
terminates with the answer clause Q ← (where WFM(Di* ∪ Si*) is the well-founded 
model of Di* ∪ Si*). That is, Q ∈ WFM(Di* ∪ Si*) iff the body of Q 
← Q is reduced to an empty set of literals. In contrast, Q is false in WFM(Di* ∪ 
Si*) iff all possible derivations of Q ← Q either finitely fail or fail infinitely due 
to positive recursion; Q has an undefined truth value in all other cases. In the 
case where Q has an undefined truth value, SLG-resolution produces conditional 
answers of the form Q ← δ, viz. Q is true if δ is true, where δ is a nonempty set 
of delayed negative literals. 

The soundness of SLG-resolution is important from a security perspective 
since it ensures that no unauthorized access request is permitted from an 
SLG-derivation on D* ∪ S*. The completeness of SLG-resolution is important from 
a security perspective since it implies that non-floundering SLG-resolution is 
sufficiently strong to ensure that all authorized access requests are provable 
from D* ∪ S* and, equivalently, that no authorized access request is ever denied. 

Since the only terms that are included in D* ∪ S* are constants or variables, 
it follows that D* ∪ S* satisfies the bounded-term-size property [22]; 
SLG-resolution is guaranteed to terminate for theories that satisfy this property 
[9]. Moreover, SLG-resolution has polynomial time data complexity [22] for 
function-free normal theories. 

We assume that a security administrator (SA) is responsible for specifying 
D* ∪ S*. We also assume that a closed policy [8] is to be used for protecting 
a deductive database. That is, a user’s permission to access information must 
be specifically authorized in a security theory; users do not automatically have 
access to information when there is no negative authorization that prohibits 
such an access (an open policy [8]). The implementation of an open policy or any 
number of hybrid (i.e. open/closed) policies for protected databases necessitates 
that only minor modifications be made to the approach we describe. 
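
The difference between the two policies can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: access requests are modelled as (user, permission, object) triples, authorizations as plain sets of such triples, and the function names are hypothetical.

```python
def closed_policy(request, positive_auths):
    # Closed policy: access is granted only if it is explicitly authorized.
    return request in positive_auths


def open_policy(request, negative_auths):
    # Open policy: access is granted unless it is explicitly prohibited.
    return request not in negative_auths


req = ("bob", "read", "q(a)")

# With no authorizations recorded at all, the two policies disagree:
closed_result = closed_policy(req, positive_auths=set())  # request is denied
open_result = open_policy(req, negative_auths=set())      # request is granted
```

Under the closed policy assumed in this paper, the absence of information in the security theory always results in denial.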

Whilst we recognize that the session concept [19] is an important aspect 
of RBAC, we regard role activation/deactivation as being a low-level feature, 
which, though related to the issues we discuss, is not of central importance. As 
such, in the examples of the evaluation of access requests that we consider later, 
we simply assume that a user has active the set of roles that is necessary to 
perform a requested action on a protected database. 

3 Representing RBAC1 as a Logical Theory 

The minimum requirements of any RBAC model are that it provides means 
for specifying that permissions to perform operations on database objects are 
associated with a role, and that users are assigned to roles. The assignment of 
users and permissions to roles is represented in our approach by an SA including 
definitions of ura(U,R) and rpa(R,P,O) predicates in a normal clause theory 
that represents an RBAC1 model. 

In the ura(U,R) relation, the predicate name ura is shorthand for user-role 
assignment; definitions of ura are used in an RBAC1 theory to represent that 
user U is assigned to role R. Similarly, rpa(R,P,O) stands for role-permission 
assignment; definitions of rpa in an RBAC1 theory are used to specify that role 
R is assigned the permission to perform a P operation on a database object O. 

The ura(U,R) and rpa(R,P,O) relations are defined by an SA using normal 
clauses. For example, ura(bob,r1) ← specifies that the user Bob is assigned to 
the role r1; rpa(r1,insert,q(V,Y,Z)) ← V≠a specifies that role r1 is assigned the 
permission to insert any instance of the predicate q(V,Y,Z) provided that V is 
not equal to a; rpa(X,insert,r(X,Y)) ← specifies that all roles are permitted to 
insert any instance of the r predicate (i.e. the insert permission on r is publicly 
accessible); and rpa(r1,delete,s(a,Y,Z)) ← Z < 20 specifies that the role r1 
has the permission to delete instances of s if the first argument of s(V,Y,Z) is 
constrained to be a and Z values are less than 20. 

Additional authorizations can be straightforwardly specified by formulating 
ura or rpa definitions in terms of ura, rpa, not ura or not rpa conditions. Since 
constants and variables are used in their specification, ura and rpa definitions 
can be as specific or as general as is required. For example, to represent that 
update permissions imply read permission the following pair of clauses may be 
used: rpa(R,read,O) ← rpa(R,insert,O); rpa(R,read,O) ← rpa(R,delete,O). 

In addition to user-role and permission-role assignments, role hierarchies are 
the other key component of RBAC1. Role hierarchies are used to represent the 
idea that “senior roles” inherit the (positive) permissions assigned to “junior” 
roles (but not conversely). 

To represent an RBAC1 role hierarchy in our approach, an SA uses a set 
of ground instances of a binary relation to describe the pairs of roles that are 
involved in a “seniority” relationship in the partial order (R,≥) that represents 
a role hierarchy; R is a set of roles and ≥ is a “senior to” relation. 

In more formal terms, a role R1 is senior to role R2 in a role hierarchy, RH, 
iff there is a path from R1 to R2 in RH such that R1 ≥ R2 holds in the partial 
order describing RH. The reflexive, antisymmetric and transitive senior to relation 
(i.e. ≥) may be defined in terms of an irreflexive and intransitive relation 
“directly senior to”. The directly senior to relation may be defined 
(since ≥ is not dense) in the following way (where ∧ is logical ‘and’, ¬ is 
classical negation, and Ri (i ∈ {1,2,3}) denotes an arbitrary role): 

∀R1,R2 [R1 is directly senior to R2 iff R1 ≥ R2 ∧ R1 ≠ R2 ∧ 
        ¬∃R3 [R1 ≥ R3 ∧ R3 ≥ R2 ∧ R1 ≠ R3 ∧ R2 ≠ R3]] 
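
To illustrate this definition, the following Python sketch derives the “directly senior to” pairs from a given partial order. It is an illustrative assumption rather than part of the approach: the function name directly_senior is hypothetical, and the partial order is supplied as an explicit set of (senior, junior) pairs.

```python
def directly_senior(roles, geq):
    """Return the pairs (r1, r2) such that r1 is directly senior to r2.

    roles: a finite set of role names.
    geq:   the reflexive, antisymmetric, transitive 'senior to' relation,
           given as a set of (r1, r2) pairs meaning r1 >= r2.
    """
    pairs = set()
    for r1 in roles:
        for r2 in roles:
            if r1 == r2 or (r1, r2) not in geq:
                continue
            # r1 is directly senior to r2 iff no third role lies strictly between.
            has_intermediate = any(
                r3 != r1 and r3 != r2 and (r1, r3) in geq and (r3, r2) in geq
                for r3 in roles
            )
            if not has_intermediate:
                pairs.add((r1, r2))
    return pairs


# A three-role chain: r1 >= r2 >= r3 (and hence r1 >= r3).
roles = {"r1", "r2", "r3"}
geq = {(r, r) for r in roles} | {("r1", "r2"), ("r2", "r3"), ("r1", "r3")}
d_s = directly_senior(roles, geq)
```

For the chain above, the pair (r1, r3) is excluded because r2 lies between the two roles, so only (r1, r2) and (r2, r3) remain.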



We use ground instances of a binary d-s predicate in an RBAC1 theory to 
record the pairs of roles that are involved in a “directly senior to” relationship. 
That is, the assertion d-s(ri,rj) is used to record that role ri is directly senior to 
the role rj in an RBAC1 role hierarchy. 

We also require the following set of clauses that define the senior-to relation 
as the reflexive-transitive closure of the d-s relation (where _ is a “don’t 
care”/anonymous variable): 

senior-to(R1,R1) ← d-s(R1,_) 
senior-to(R1,R1) ← d-s(_,R1) 
senior-to(R1,R2) ← d-s(R1,R2) 
senior-to(R1,R2) ← d-s(R1,R3),senior-to(R3,R2) 

The senior-to predicate is used in the definition of permitted that follows: 

permitted(U,P,O) ← ura(U,R1),senior-to(R1,R2),rpa(R2,P,O) 

The permitted clause expresses that a user U is authorized to perform the P 
operation on an object O if U is assigned to a role R1 that is senior to the role 
R2 in an RBAC1 role hierarchy associated with the RBAC1 theory, and R2 has 
been assigned the P permission on O. That is, U has the P permission on O if 
U is assigned to a role that inherits the P permission on O. 
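
The senior-to and permitted clauses can be mimicked over ground facts as follows. This is a minimal Python sketch under stated assumptions: ura, rpa and d-s are given as sets of ground tuples (so conditions in rpa bodies are not modelled), and the function names are hypothetical. As with the clauses, a role that occurs in no d-s pair is not senior to itself here.

```python
def senior_to(d_s):
    """Reflexive-transitive closure of the d-s pairs, over roles occurring in d-s."""
    closure = set(d_s)
    for senior, junior in d_s:
        closure.add((senior, senior))   # senior-to(R1,R1) <- d-s(R1,_)
        closure.add((junior, junior))   # senior-to(R1,R1) <- d-s(_,R1)
    changed = True
    while changed:
        changed = False
        # senior-to(R1,R2) <- d-s(R1,R3), senior-to(R3,R2)
        for r1, r3 in d_s:
            for r3b, r2 in list(closure):
                if r3 == r3b and (r1, r2) not in closure:
                    closure.add((r1, r2))
                    changed = True
    return closure


def permitted(user, perm, obj, ura, rpa, seniority):
    # permitted(U,P,O) <- ura(U,R1), senior-to(R1,R2), rpa(R2,P,O)
    return any(
        (user, r1) in ura and (r2, perm, obj) in rpa
        for r1, r2 in seniority
    )


d_s = {("r1", "r2")}
ura = {("alice", "r1"), ("bob", "r2")}
rpa = {("r2", "insert", "r(a)"), ("r1", "delete", "q(a)")}
seniority = senior_to(d_s)  # would be recomputed only when d_s changes

alice_inherits = permitted("alice", "insert", "r(a)", ura, rpa, seniority)
bob_denied = permitted("bob", "delete", "q(a)", ura, rpa, seniority)
```

Alice, assigned to r1, inherits the insert permission of the junior role r2, whereas Bob, assigned to r2, does not acquire the delete permission of the senior role r1.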

In implementations of our approach, senior-to should be stored as a persistent 
relation [1] that is recomputed only when changes are made to the set of d-s 
assertions in an RBAC1 theory; it is not computed each time permitted(U,P,O) 
is evaluated. On a point of detail, we also note that a role ri that is junior 
to a role rj in a role hierarchy is not specified as having more read or update 
permission than rj (integrity constraints on a security theory may be used to 
ensure that this semantic constraint is satisfied). 

The clauses defining senior-to and permitted are included in every instance 
of an RBAC1 theory; the application-specific d-s, ura and rpa clauses define a 
particular instance of an RBAC1 theory. For all “meaningful” RBAC1 theories, 
the application-specific d-s, ura, and rpa clauses will be acyclic [2]; permitted is 
also acyclic. Although senior-to violates the acyclicity property, it is nevertheless 
negation-free. It follows therefore that any instance of an RBAC1 theory is locally 
stratified and has a unique 2-valued perfect model [18] that coincides with the 
total well-founded model of the theory [9]. An important corollary of RBAC1 
theories being categorical and having a total model is that these theories define 
a consistent and unambiguous set of authorizations. 



4 Protected Databases: Representational Issues 

To express a deductive database, D, in a protected form, D*, that guarantees 
that any instance of an arbitrary EDB or IDB predicate, O, contained within 
it is protected from unauthorized change requests, we use the holds relation defined 
by the following metalogical statement (where ΔD*_U denotes the subset 
of ΔD* that the user U can retrieve from): 

holds(U,inserts,O,D*,S*,ΔD*) iff 
    [S* ⊢SLG permitted(U,insert,O) and ΔD*_U ∪ S* ⊢SLG O] 

holds(U,deletes,O,D*,S*,ΔD*) iff 
    [S* ⊢SLG permitted(U,delete,O) and ΔD*_U ∪ S* ⊬SLG O] 

That is, a request by a user U to insert O into D* will be satisfied in the 
updated state ΔD* of D* iff U is proved, using SLG-resolution on S*, to be 
authorized to insert O into D* and O is provable from ΔD*_U ∪ S* using 
SLG-resolution. Similarly, a request by U to delete O from D* will be satisfied in 
the updated state ΔD* of D* iff U is proved by SLG-resolution on S* to be 
authorized to delete O from D* and O is not provable from ΔD*_U ∪ S* by 
using SLG-resolution. In our approach, the change requests “ordinary” users 
may make are for the insertion or deletion of a ground atomic instance of O that 
is either explicit in D* (i.e. in EDB(D*)) or implicit in D* (i.e. in IDB(D*)). The 
update requests issued by “ordinary” users may be satisfied only by changing 
EDB(D*); only the SA or other trusted users are permitted to make changes 
other than ground atomic insertions into and deletions from D*. 

In addition to satisfying the security specified for D*, the changes a user is 
authorized to make to D* must also satisfy the set of integrity constraints on 
D*. Unfortunately, space restrictions prevent us from discussing the issue of constraint 
satisfaction in any appreciable depth in this paper. For the details of constraint 
checking on RBAC security theories we refer the interested reader to [5]. 

In the case of protection from unauthorized retrieval requests, the holds relation 
is defined thus: 

holds(U,reads,O,D*,S*,D*) iff [S* ⊢SLG permitted(U,read,O) and D* ⊢SLG O] 

That is, U is permitted to read O (i.e. U is permitted to know that O is true) 
in D* iff U is proved, using SLG-resolution on S*, to be authorized to know that 
O is in D* and if O is provable from D* by SLG-resolution. It should be noted 
that ΔD* does not appear in the definition of holds in the case of retrievals since 
D* = ΔD* in this instance. 

The 6-ary holds predicate is appropriate for describing how insertion, dele- 
tion and retrieval should be understood in the context of protected databases. 
However, as we will see next, a ternary variant holds(U,P,0) is used in D* to 
specify when U’s request to exercise the P permission on O is satisfied with 
respect to specific instances of D* and S*. 

4.1 Insertions and IDB Predicates 

To see what is involved in representing an IDB predicate in its protected form in 
D*, consider a clause of arbitrary complexity and arity i that defines a predicate 
p (where tj, j ∈ {1,..,i}, is a term and the set of terms may be empty), 
to wit: p(t1,t2,..,ti) ← A1,..,Am,not B1,..,not Bn (m ≥ 0, n ≥ 0). 

For an authenticated user to be permitted to insert a ground instance of 
p(t1,t2,..,ti) into D* it is necessary to ensure that for each Lq literal (q=(1..m+n)) 
in the (non-empty) body of some clause that has p(t1,t2,..,ti) in the head, Lq 
is true in ΔD*. This makes inserts involving IDB relations relatively straightforward 
since the only option to insert an instance of p(t1,t2,..,ti) into D* is to 
ensure that the conjunction of all of the literals in the body of at least one clause 
with p(t1,t2,..,ti) in its head is true (provable) in ΔD*. Hence, to protect D* 
from the unauthorized insertion of instances of p(t1,t2,..,ti), an SA will include 
the following protected form of p(t1,t2,..,ti) in D*: 

holds(U,inserts,p(t1,t2,..,ti)) ← permitted(U,insert,p(t1,t2,..,ti)), 
    holds(U,inserts,A1),...,holds(U,inserts,Am), 
    holds(U,deletes,B1),...,holds(U,deletes,Bn) 

The intended reading of this clause is that: U’s request to insert a ground 
instance of p(t1,t2,..,ti) into D* is satisfied iff U is permitted by S* to insert this 
instance of p(t1,t2,..,ti) into D* and U is permitted to make true each Aj atom 
(j=(1..m)) in ΔD* and Aj is true in ΔD*, and U is authorized to make false 
each Bk literal (k=(1..n)) in ΔD* and Bk is false in ΔD*. That is, p(t1,t2,...,ti) 
is true in ΔD* iff for all Aj and Bk literals in the body of p(t1,t2,...,ti), U inserts 
Aj into D* (i.e. makes Aj true in ΔD*) and deletes Bk from D* (i.e. makes Bk 
false in ΔD*). 

As we have said, all requests by U to modify a ground instance of p(t1,t2,..,ti) 
are decided by establishing whether U is authorized to modify the EDB predicates 
in terms of which p(t1,t2,..,ti) and any other IDB predicates included in the 
body of p(t1,t2,..,ti) are defined. Hence, making each Aj (Bk) literal true (false) 
in ΔD* requires that EDB(D*) be changed such that Aj is true (Bk is false) in 
ΔD*, and U being authorized to insert Aj into (delete Bk from) D* means U being 
authorized to change EDB(D*) to make Aj true (Bk false) in EDB(ΔD*). If 
the insertion of a ground instance of p(t1,t2,..,ti) cannot be satisfied by changing 
EDB(D*) then no change can be made to D*. 



4.2 Deletions and IDB Predicates 

Matters are more complicated in the case of protecting an IDB predicate 
p(t1,t2,...,ti) (where p(t1,t2,...,ti) ← A1,...,Am,not B1,...,not Bn) from an 
unauthorized delete request since to delete a ground instance of p(t1,t2,..,ti) 
from D* it is necessary to ensure that for every clause that includes p(t1,t2,..,ti) 
in the head of a holds clause in D*, there exists some literal Lq (q=(1..m+n)) 
in the body of the holds clause such that Lq is false in ΔD*. 

When there is one clause defining p(t1,t2,..,ti) in D then there are m+n 
holds(U,deletes,p(t1,t2,..,ti)) clauses in D* for the m+n conditions that appear 
in the body of the definition of p(t1,t2,..,ti) in D. That is, to protect the 
IDB clause, p(t1,t2,..,ti) ← A1,...,Am,not B1,...,not Bn from the unauthorized 
deletion of instances of p(t1,t2,..,ti), the following protected variant of p is 
required in D*: 

holds(U,deletes,p(t1,t2,..,ti)) ← 
    permitted(U,delete,p(t1,t2,..,ti)), holds(U,deletes,A1) 
... 
holds(U,deletes,p(t1,t2,..,ti)) ← 
    permitted(U,delete,p(t1,t2,..,ti)), holds(U,deletes,Am) 
holds(U,deletes,p(t1,t2,..,ti)) ← 
    permitted(U,delete,p(t1,t2,..,ti)), holds(U,inserts,B1) 
... 
holds(U,deletes,p(t1,t2,..,ti)) ← 
    permitted(U,delete,p(t1,t2,..,ti)), holds(U,inserts,Bn) 

When p(t1,t2,..,ti) is defined by several clauses, the delete protected clauses 
in D* must be specified such that a set of minimal changes are made to EDB(D*) 
to ensure that no clause in D* with p(t1,t2,..,ti) in the head will have a set of 
literals in its body all of which are true in ΔD* (having multiple clauses with 
p(t1,t2,..,ti) in the head presents no problems in the case of insertion). A simple 
translation procedure may be used to generate the delete protected form of p 
when p(t1,t2,..,ti) appears in the head of q (q > 1) clauses in D. 

As in the case of insertion, a delete request can only be satisfied if EDB(D*) 
can be changed to satisfy the request. 

4.3 Update Protection of EDB Predicates 

To express the protection of an arbitrary EDB relation, e, of arity n 
in D*, the body of the clause that protects e from unauthorized inserts 
(deletes) will be of the form: permitted(U,insert,e(X1,X2,...,Xn)) 
(permitted(U,delete,e(X1,X2,...,Xn))) where Xi (i=(1..n)) is a variable. 

The set of facts in D* appear in exactly the same form as they do in an 
ordinary, unprotected deductive database (i.e. as a set of ground atomic n-ary 
assertions of the form e(c1,c2,...,cn) where ci (i=(1..n)) is a constant). To simplify 
their presentation, we omit the assertions in D* in the examples that follow. 

Example 1 (An update protected database) 

Consider the database, D1 = {s(X) ← not q(X),r(X); r(b) ←} where q and r are 
EDB predicates. The update protected form of D1, D1*, is: 

holds(U,inserts,s(X)) ← permitted(U,insert,s(X)), holds(U,deletes,q(X)), 
    holds(U,inserts,r(X)) 

holds(U,deletes,s(X)) ← permitted(U,delete,s(X)), holds(U,inserts,q(X)) 

holds(U,deletes,s(X)) ← permitted(U,delete,s(X)), holds(U,deletes,r(X)) 

holds(U,inserts,q(X)) ← permitted(U,insert,q(X)) 

holds(U,deletes,q(X)) ← permitted(U,delete,q(X)) 

holds(U,inserts,r(X)) ← permitted(U,insert,r(X)) 

holds(U,deletes,r(X)) ← permitted(U,delete,r(X)) □ 
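
The translation into update protected form is mechanical, as the following Python sketch suggests. It is an illustration only: the function names are hypothetical, clauses are rendered as strings with "<-" standing for the clause arrow, and only predicates defined by a single clause are handled (the procedure for multiple defining clauses is not reproduced here).

```python
def insert_protected(head, pos, neg):
    """Insert protected form of: head <- pos_1,..,pos_m, not neg_1,.., not neg_n."""
    body = [f"permitted(U,insert,{head})"]
    body += [f"holds(U,inserts,{a})" for a in pos]   # make each Aj true
    body += [f"holds(U,deletes,{b})" for b in neg]   # make each Bk false
    return f"holds(U,inserts,{head}) <- " + ", ".join(body)


def delete_protected(head, pos, neg):
    """One clause per body literal: falsifying any one literal deletes head."""
    guard = f"permitted(U,delete,{head})"
    clauses = [f"holds(U,deletes,{head}) <- {guard}, holds(U,deletes,{a})" for a in pos]
    clauses += [f"holds(U,deletes,{head}) <- {guard}, holds(U,inserts,{b})" for b in neg]
    return clauses


def edb_protected(pred):
    """Update protected form of an EDB predicate."""
    return [
        f"holds(U,inserts,{pred}) <- permitted(U,insert,{pred})",
        f"holds(U,deletes,{pred}) <- permitted(U,delete,{pred})",
    ]


# D1 = {s(X) <- not q(X), r(X); r(b) <-} with q and r as EDB predicates.
d1_star = [insert_protected("s(X)", ["r(X)"], ["q(X)"])]
d1_star += delete_protected("s(X)", ["r(X)"], ["q(X)"])
d1_star += edb_protected("q(X)") + edb_protected("r(X)")

for clause in d1_star:
    print(clause)
```

The seven generated clauses correspond to those of Example 1, except that positive body conditions are listed before negative ones, which does not affect their meaning.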

4.4 Read Protection 

The read protected form of p(t1,t2,..,ti) ← A1,...,Am,not B1,...,not Bn in D is: 

holds(U,reads,p(t1,t2,..,ti)) ← permitted(U,read,p(t1,t2,..,ti)), 
    holds(U,reads,A1),...,holds(U,reads,Am), 
    not holds(U,reads,B1),...,not holds(U,reads,Bn) 

That is, an instance of holds(U,reads,p(t1,t2,...,ti)) will be true in D* iff U 
is permitted by S* to read p(t1,t2,...,ti) and for all Aj literals (j ∈ {1,..,m}) U 
is permitted by S* to know that Aj is true in D* and Aj is true in D* and for 
all Bk literals (k ∈ {1,..,n}) U is not permitted by S* to know that Bk is true in 
D* or Bk is false in D*; in either of these cases we take not holds(U,reads,Bk) 
to be true in U’s view of D*. It follows from this that we make no distinction 
between standard failure and failures due to a lack of permission. For example, 
if D2 = {p ← not q; q ←} and U has read permission on p but not on q then we 
regard p as true in D2* as far as U is concerned since although q is true in D2*, 
q is false in U’s view of D2* since U is not authorized to know that q is true in 
the database. 

The read protected form of an n-ary EDB predicate e is as follows (where 
Xi, i=(1..n), is a variable): 

holds(U,reads,e(X1,X2,...,Xn)) ← 
    permitted(U,read,e(X1,X2,...,Xn)),e(X1,X2,...,Xn) 
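
The effect of read protection on a user’s view can be sketched in Python for ground (propositional) databases. The sketch rests on several assumptions that are not part of the approach itself: the function name holds_reads is hypothetical, read permissions are supplied directly as a set of (user, atom) pairs rather than derived from S*, and the rules must be non-recursive so that the simple recursion terminates (SLG-resolution is not modelled).

```python
def holds_reads(user, atom, rules, facts, can_read):
    """True iff `user` may read `atom` and `atom` is provable in the user's view.

    rules:    list of (head, positive_body_atoms, negated_body_atoms)
    facts:    set of ground EDB atoms
    can_read: set of (user, atom) pairs standing in for permitted(U,read,O)
    """
    if (user, atom) not in can_read:
        return False          # no permission is treated exactly like failure
    if atom in facts:
        return True           # read protected EDB form: permitted and e
    for head, pos, neg in rules:
        if head != atom:
            continue
        positives_ok = all(holds_reads(user, a, rules, facts, can_read) for a in pos)
        negatives_ok = not any(holds_reads(user, b, rules, facts, can_read) for b in neg)
        if positives_ok and negatives_ok:
            return True
    return False


# D2 = {p <- not q; q <-}
rules = [("p", [], ["q"])]
facts = {"q"}

# u may read p but not q: q is false in u's view, so p is true for u.
p_for_u = holds_reads("u", "p", rules, facts, {("u", "p")})

# v may read both p and q: q is true in v's view, so p is false for v.
p_for_v = holds_reads("v", "p", rules, facts, {("v", "p"), ("v", "q")})
```

The two users therefore see different truth values for p, which reflects the absence of any distinction between standard failure and failure due to a lack of permission.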

5 Protected Databases: Computational Issues 

Since every protected database, D*, and RBAC1 security theory, S*, is expressed 
in normal clause logic, it follows that SLG-resolution may be used to evaluate 
access requests (i.e. holds queries) on D* ∪ S*. In fact, the update protected 
form of D* is always a definite theory and since S* is locally stratified it follows 
that full SLG-resolution is not required for the evaluation of update requests. 

When a user U requests the insertion or deletion of an object O (a ground 
atomic formula), the goal clause that is evaluated on D* ∪ S* using SLG-resolution 
is a ground instance of holds(U,P,O) ← holds(U,P,O) (henceforth we 
will denote a ground instance of any predicate r by r^G where G is for ground). 
In holds(U,P,O)^G, U is substituted with the identifier of the authenticated user 
making the change request, and P is insert or delete (P=read in the case of 
retrieval requests and in this case the O argument in a holds(U,P,O) query need 
not be ground). In implementations of our approach, an end user may either 
be required to specify a holds goal clause directly on D* or a holds(U,P,O) goal 
clause may be set up by a routine that generates the required form of goal clause 
from a user interface. The U value may be automatically set by extracting the 
user identifier from the user’s login or session details. 

If e is an EDB predicate and permitted(U,insert,e(X1,X2,...,Xn))θ (resp. 
permitted(U,delete,e(X1,X2,...,Xn))θ) is an SLG-resolvent [9] in an SLG-derivation 
on D* ∪ S* that generates the answer clause holds(U,P,O)^G ← 
(where θ is a variable-free answer substitution and holds(U,P,O)^G ← has an 
empty set of delayed literals) then e(X1,X2,...,Xn)θ is made true (resp. false) 
in ΔD* by inserting (resp. physically deleting) e(X1,X2,...,Xn)θ into (resp. 
from) EDB(D*). 

Each SLG-derivation on D* ∪ S* that terminates with holds(U,P,O)^G ← 
gives a minimal set of authorized insert and delete operations that is guaranteed 
to satisfy the change request on D*. The required set of changes 
to EDB(D*) is identified from the conjunction of permitted(U,insert,Pi) 
and permitted(U,delete,Qj) subgoals involved in generating the answer clause 
holds(U,P,O)^G ←. That is, if permitted(U,insert,P1) ∧ permitted(U,insert,P2) ∧ 
... ∧ permitted(U,insert,Pm) ∧ permitted(U,delete,Q1) ∧ permitted(U,delete,Q2) 
∧ ... ∧ permitted(U,delete,Qn) is the conjunction of permitted conditions that 
appear in an SLG-derivation on D* ∪ S* that produces the answer clause 
holds(U,P,O)^G ← (where P is inserts or deletes), and Pi (i ∈ {1,..,m}) and Qj 
(j ∈ {1,..,n}) are EDB predicates then U’s change request on D* is authorized 
and the authorized modification that is required on D* to satisfy U’s change request 
is to insert P1 and P2 and ... and Pm into EDB(D*) and to delete Q1 and 
Q2 and ... and Qn from EDB(D*). If the SLG-derivation for a holds(U,P,O)^G 
goal clause on D* ∪ S* is failed then either the change request is satisfiable 
with respect to EDB(D*) but U is not authorized to make it (i.e. S* prevents U 
making the changes to D* to produce EDB(ΔD*)) or the modification request 
cannot be satisfied by modifying EDB(D*) alone. 

It should be noted that, to satisfy an update request, not all of the changes 
to EDB(D*) that are identified via the set of permitted resolvents that appear in 
an SLG-derivation of holds(U,P,O)^G ← will need to be performed to generate ΔD*. 
Some of the required changes may be vacuously satisfied in D*. Note also that, 
assuming that update permissions imply read permission, if the answer clause 
holds(U,inserts,O)^G ← is generated by SLG-resolution in response to the request 
by U to insert O into D* then O is true in the subset of ΔD* that U is authorized 
to read. Moreover, if the answer clause holds(U,deletes,O)^G ← is generated by 
SLG-resolution in response to the request by U to delete O from D* then O 
is false in the subset of ΔD* that U is authorized to read. More formally we have: 

THEOREM 1: if holds(U,inserts,O)^G ← is an answer clause by SLG-resolution 
on D* ∪ S* then ΔD* ∪ S* ⊢SLG holds(U,reads,O)^G 

Proof: Consider the arbitrarily complex definition of the predicate O in 
D: O ← A1,...,Am,not B1,...,not Bn (the case where there are multiple 
clauses with O in the head is a straightforward inductive generalization 
of the argument that follows). The insert protected form of O in D* is 
as follows: holds(U,inserts,O) ← permitted(U,insert,O),holds(U,inserts,A1), 
...,holds(U,inserts,Am),holds(U,deletes,B1),...,holds(U,deletes,Bn). The read 
protected form of O in D* is: holds(U,reads,O) ← permitted(U,read,O), 
holds(U,reads,A1),...,holds(U,reads,Am),not holds(U,reads,B1),...,not 
holds(U,reads,Bn). If holds(U,inserts,O)^G ← is an answer clause by SLG-resolution 
on D* ∪ S* then S* ⊢SLG permitted(U,insert,O)^G. Moreover, 
∀Ai (i ∈ {1,..,m}), S* ⊢SLG permitted(U,insert,Ai)^G and so ΔD* ⊢SLG 
Ai. Since insert permission implies read permission we also have S* ⊢SLG 
permitted(U,read,Ai)^G and thus since ΔD* ∪ S* ⊢SLG permitted(U,read,Ai)^G 
∧ Ai it follows that ΔD* ∪ S* ⊢SLG holds(U,reads,Ai)^G. Furthermore, if 
holds(U,inserts,O)^G ← is an answer clause by SLG-resolution on D* ∪ S* then 
∀Bj (j ∈ {1,..,n}), S* ⊢SLG permitted(U,delete,Bj)^G and so ΔD* ⊬SLG Bj. It 
follows then that ΔD* ∪ S* ⊢SLG not holds(U,reads,Bj)^G. Therefore we have, 
∀Ai (i ∈ {1,..,m}) and ∀Bj (j ∈ {1,..,n}) in the definition of O, that ΔD* ∪ S* 
⊢SLG holds(U,reads,Ai)^G and ΔD* ∪ S* ⊢SLG not holds(U,reads,Bj)^G and so 
ΔD* ∪ S* ⊢SLG holds(U,reads,O)^G. □ 

THEOREM 2: if holds(U,deletes,O)^G ← is an answer clause by SLG-resolution 
on D* ∪ S* then ΔD* ∪ S* ⊬SLG holds(U,reads,O)^G 

Proof: Consider the arbitrarily complex definition of the predicate O in D: O ← 
A1,...,Am,not B1,...,not Bn (the case where there are multiple clauses with O 
in the head is a straightforward inductive generalization of the argument that 
follows). The delete protected form of O in D* is as follows (where ; is used to separate 
clauses): holds(U,deletes,O) ← permitted(U,delete,O),holds(U,deletes,A1); 
...; 
holds(U,deletes,O) ← permitted(U,delete,O),holds(U,deletes,Am); 
holds(U,deletes,O) ← permitted(U,delete,O),holds(U,inserts,B1); 
...; 
holds(U,deletes,O) ← permitted(U,delete,O),holds(U,inserts,Bn). The read 
protected form of O in D* is: holds(U,reads,O) ← permitted(U,read,O), 
holds(U,reads,A1),...,holds(U,reads,Am),not holds(U,reads,B1),...,not 
holds(U,reads,Bn). If holds(U,deletes,O)^G ← is an answer clause by SLG-resolution 
on D* ∪ S* then for each clause with holds(U,deletes,O) in its head 
either ∃Ai (i ∈ {1,..,m}) such that S* ⊢SLG permitted(U,delete,Ai)^G and thus 
ΔD* ⊬SLG Ai or ∃Bj (j ∈ {1,..,n}) such that S* ⊢SLG permitted(U,insert,Bj)^G 
and so ΔD* ⊢SLG Bj. If ΔD* ⊬SLG Ai then ΔD* ⊬SLG holds(U,reads,Ai)^G 
and ΔD* ∪ S* ⊬SLG holds(U,reads,O). If ΔD* ⊢SLG Bj then since S* ⊢SLG 
permitted(U,insert,Bj)^G and update permissions imply read permission we 
have ΔD* ∪ S* ⊢SLG holds(U,reads,Bj)^G. But then, ΔD* ∪ S* ⊬SLG not 
holds(U,reads,Bj)^G. Thus, ΔD* ∪ S* ⊬SLG holds(U,reads,O)^G. □ 

The key point to note from the results above is that if U’s request to insert O into 
(delete O from) D* is performed then U knows that O is true (false) in ΔD*. 
Moreover, all users that are authorized to read O in ΔD* know that O is true 
(false) in ΔD* if U inserts O into (deletes O from) D*. Those users without the 
necessary read permissions will not see the update on D*. For example, if D3 = {p 
← not q} and U inserts q as a consequence of permitted(U,insert,q)^G being an 
SLG-resolvent in the generation of the answer clause holds(U,deletes,p)^G ← on 
D3* then p is false in ΔD3* as far as U is concerned. However, for any user who 
is authorized to read p but not q, p is true in ΔD3*. Similarly, if U inserts r 
into D3* then r is in ΔD3* only for those users with the read permission on r. 

Example 2 (Insert Protection) 

Consider D1*, from Example 1, and suppose that the following security 
specification applies to it: 

S1* = {ura(alice,r1) ←; rpa(r1,delete,q(X)) ←; 
       rpa(r1,P,s(X)) ←; rpa(r2,insert,r(a)) ←; d-s(r1,r2) ←} 

Now, suppose that the user Alice requests the change insert s(a) on D1*. 
In this case, the answer clause holds(alice,inserts,s(a)) ← is generated from 
D1* ∪ S1* by SLG-resolution with the following set of permitted subgoals 
as resolvents: permitted(alice,insert,s(a)), permitted(alice,delete,q(a)), 
permitted(alice,insert,r(a)). 

Since s is an IDB predicate, it follows that the minimal authorized change to 
EDB(D1*) that satisfies Alice’s insert request is to both delete q(a) and insert 
r(a). Since q(a) is not in EDB(D1*), only the insert of r(a) into EDB(D1*) needs 
to be performed to satisfy Alice’s change request. □ 
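
The computation in Example 2 can be imitated with a small Python sketch. It is not an implementation of SLG-resolution: the names update_plan, permitted_s1 and effective_changes are hypothetical, the rule for s is instantiated by hand for the constants a and b, each IDB predicate is assumed to have a single defining clause, and the authorizations of S1* (with Alice’s inherited permissions) are hard-coded in permitted_s1.

```python
def update_plan(user, op, atom, rules, permitted):
    """Return a set of authorized (op, edb_atom) changes, or None if the request fails."""
    if not permitted(user, op, atom):
        return None
    definition = [r for r in rules if r[0] == atom]
    if not definition:                      # EDB atom: the change itself
        return {(op, atom)}
    _, pos, neg = definition[0]
    if op == "insert":                      # every body literal must be made true
        plan = set()
        for sub_op, atoms in (("insert", pos), ("delete", neg)):
            for a in atoms:
                sub = update_plan(user, sub_op, a, rules, permitted)
                if sub is None:
                    return None
                plan |= sub
        return plan
    for sub_op, atoms in (("delete", pos), ("insert", neg)):
        for a in atoms:                     # delete: falsify one body literal
            sub = update_plan(user, sub_op, a, rules, permitted)
            if sub is not None:
                return sub
    return None


def permitted_s1(user, op, atom):
    # S1*: alice is assigned r1, which is senior to r2 and so inherits r2's permission.
    if user != "alice":
        return False
    pred = atom.split("(")[0]
    return ((op == "delete" and pred == "q")        # rpa(r1,delete,q(X))
            or pred == "s"                          # rpa(r1,P,s(X))
            or (op == "insert" and atom == "r(a)"))  # rpa(r2,insert,r(a))


def effective_changes(plan, edb):
    # Drop changes that are vacuously satisfied in the current EDB.
    return {(op, a) for op, a in plan if (op == "insert") != (a in edb)}


rules = [("s(a)", ["r(a)"], ["q(a)"]), ("s(b)", ["r(b)"], ["q(b)"])]
edb = {"r(b)"}

plan = update_plan("alice", "insert", "s(a)", rules, permitted_s1)
needed = effective_changes(plan, edb)
```

Under these assumptions, plan contains the deletion of q(a) and the insertion of r(a), while needed contains only the insertion of r(a), since q(a) is not in the EDB.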

It should be noted that any authorized changes to D* must satisfy the set 
of integrity constraints on D* in order to be performed. A variety of theorem-proving 
methods for constraint checking (e.g. [12]) may be incorporated into our 
approach to ensure that authorized updates on D* satisfy the constraints on 
D*. Alternatively, the approach to constraint checking described in [5] may be 
used. 

Example 3 (Delete Protection) 

Suppose that Alice issues a request to delete s(b) from D1* ∪ S1*. In this 
case the goal clause holds(alice,deletes,s(b)) ← holds(alice,deletes,s(b)) finitely 
fails. Hence, no authorized modification of D1* is possible that will satisfy 
Alice’s change request. However, if S1* were changed to S2* where S2* = S1* 
∪ {rpa(r2,insert,q(X)) ←}, then the answer clause holds(alice,deletes,s(b)) ← 
is generated from D1* ∪ S2* by SLG-resolution with the resolvent 
permitted(alice,insert,q(b)). Hence insert q(b) is an authorized modification that will 
satisfy Alice’s request to delete s(b) from D1*. Similarly, if S1* were changed 
to S3* where S3* = S1* ∪ {rpa(r2,delete,r(X)) ←}, and S3* is the security 
specification on D1* then permitted(alice,delete,r(b)) is a resolvent in the 
generation of the answer clause holds(alice,deletes,s(b)) ←. Thus, the deletion of 
r(b) is an authorized modification that satisfies Alice’s delete request. □ 

To simplify the discussion in this section, we have assumed that the clauses 
in D* contain no local variables. When clauses in D* contain local variables, 
existential quantifiers are required to specify the required changes on EDB(D*). 



6 Protected Databases: Pragmatics 

Having described our approach for representing protected databases and com- 
puting with them, in this section we consider a number of important practical 
issues that are related to the approach. 

The first point to note is that the security information in S* is specified 
independently of D*. As such, an SA may change S* without having to change 
D*, and thus user-role reassignments, changes to permission assignments, and 
changes to a role hierarchy can be easily managed. 

Although D* and S* are specified independently of each other, they are 
inextricably linked together via the permitted conditions that appear in the body 
of the clauses in D*, and the definition of permitted that is included in every 
instance of S*. Linking D* and S* in this way has the effect of partitioning D* 
into a finite set, S, of m subsets of D*, S = {D1*, D2*, ..., Dm*}, such that each 
authenticated user is authorized to retrieve information from a single subset Di* 
(i ∈ {1,..,m}) in S, is authorized to insert into one subset Dj* (j ∈ {1,..,m}) 
in S, and is authorized to delete from one subset Dk* (k ∈ {1,..,m}) in S. 

Whilst the transformation of a deductive database D into its protected form D* 
increases the number of clauses and conditions that have to be processed when 
evaluating access requests, the effect of this transformation, when combined with 
S*, is to define, for each user, the subsets they can retrieve from and update. 
Rather than evaluating a user’s access request over the whole database, 
computation is performed with respect to the subset that the user can retrieve from, 
delete from or insert into. 

An alternative to integrating D* and S* via permitted is to use S* together 
with D in its unprotected form to determine whether access requests are 
authorized or not. In this case, S* may be used after the retrieval or update request 
has been processed on D to ensure that the candidate information to be released 
to the user or updated on their behalf is authorized by S*. Alternatively, S* may 
be used prior to accessing D in order to constrain the retrievals and updates to 
those that a user is authorized to see and make before the database is accessed. 
The former option is not especially efficient since all answers to a query or all 
candidate update transactions on D will be computed prior to using S* to 
determine what the user is authorized to see or update. The latter option is more 
practical, but requires that a more complex access control procedure be used 
than the one we have described. The policy of checking S* prior to accessing the 
database can be simulated in our encoding of D* by using a computation rule 
which selects permitted subgoals prior to selecting holds predicates. 
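
The efficiency difference between the two options can be illustrated with a small Python sketch. It is not the SLG-based procedure used in this paper: the function answer, the fact list and the two predicates are hypothetical, and the count of query evaluations merely stands in for the work done on the database.

```python
def answer(facts, query, permitted, check_first):
    """Return (answers, number of times the query condition was evaluated)."""
    evaluations = 0
    answers = []
    for fact in facts:
        if check_first and not permitted(fact):
            continue                      # pruned before any query work is done
        evaluations += 1
        if query(fact) and (check_first or permitted(fact)):
            answers.append(fact)
    return answers, evaluations


facts = [("q", "a"), ("q", "b"), ("r", "a"), ("r", "b"), ("r", "c")]
may_read_r = lambda fact: fact[0] == "r"   # hypothetical permission: only r-facts
about_a = lambda fact: fact[1] == "a"      # the user's query

pre_checked = answer(facts, about_a, may_read_r, check_first=True)
post_checked = answer(facts, about_a, may_read_r, check_first=False)
```

Both calls release the same answers, but the pre-checked variant evaluates the query only on the facts the user is permitted to read.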

As a final point on practical matters, we suggest that it should not be too 
surprising that the size of D increases when it is transformed into D*; D* is a 
more expressive form of database than D. In addition to describing what each 
individual user of the database is authorized to insert, delete and retrieve, a 
protected database also includes a specification of the minimal sets of updates 
that are required to satisfy an authorized change request. 



7 Conclusions and Further Work 

In order for deductive databases to become important in practice, methods for 
ensuring their security must be developed. In this paper we have described an 
approach that makes it possible to specify a wide range of RBAC1 security 
policies for protecting any subset of a normal clause database from unauthorized 
access requests using the methods of representation and computation that are 
ordinarily used for these databases. 

Thus far we have used our approach to implement protected relational and 
deductive databases using XSB (with role activation/deactivation treated using 
XSB's assert and retract predicates). In future work, we will show how constraints 
on D* ∪ S* may be suitably represented and checked. We also intend to show 
how our previous work on temporal authorization models [4] can be incorporated 
into our representation of protected deductive databases. The other matters 
that we propose to address relate to the issue of whether D* and S* should be 
separated or integrated and the appropriateness of candidate proof methods for 
evaluating access requests in each of these cases. 
Abstract. Programming with rewrite rules and strategies has already been 
used for describing several computational logics. This paper describes 
the way the Needham-Schroeder Public-Key protocol is specified 
in ELAN, the system developed in Nancy to model and compute in the 
rewriting calculus. The protocol aims to establish mutual authentication 
between an initiator and a responder that communicate via an 
insecure network. The protocol has been shown to be vulnerable and 
a correction has been proposed. The behavior of the agents and of the 
intruders as well as the security invariants the protocol should verify 
are naturally described by conditional rewrite rules whose application is 
controlled by strategies. Attacks similar to those already described in the 
literature have been discovered. We show how different strategies using 
the same set of rewrite rules can improve the efficiency in finding the 
attacks and we compare our results to existing approaches. 

Keywords: rewriting, strategy, model-checking, authentication proto- 
cols. 



1 Introduction 

The Needham-Schroeder public-key protocol [NS78] has already been analyzed 
using several methodologies, from model-checkers like FDR [Ros94] to approaches 
based on theorem proving like NRL [Mea96]. Although this protocol is described 
by only a few rules, it was proved insecure only in 1995 by G. Lowe [Low95]. 
After the discovery of the security problem and the correctness proof of a 
modified version in [Low96], several other approaches have been used to exhibit the 
attack and obtain correct versions [Mea96, Mon99, Den98]. 

The protocol aims to provide mutual authentication between two agents com- 
municating via an insecure network. The agents use public keys distributed by 
a key server in order to encrypt their messages. In this paper we consider the 
simplified version proposed by G. Lowe in [Low95] and we assume that each 
agent knows the public keys of all the other agents but it does not know their 
private keys. 

The protocol is described by defining the messages exchanged between the 
participants. Each agent sends a message (in the network) and goes into a new 
state in which it possibly expects a confirmation message. We can thus say that 
the protocol consists of the sequence of states describing the agents and the 
communication network. Therefore it is natural to use rewrite rules to describe 
the transition from one state to another and strategies to describe the way 
these rules are applied. 

In order to describe a computational version of a certain protocol, we use 
computational systems that can express the behavior of the given protocol. A 
computational system ([KKV95]) is a combination of a set of rewrite rules and a 
strategy describing the intended set of computations. For three decades a lot of 
work has been done on rewriting and efficient implementation techniques both on 
sequential models and distributed models have been described (e.g. [KKV95]). 
These ideas are implemented in the language ELAN ([BKK+98]) which allows 
us to describe computational systems. 

In our approach the whole formalization is done at the same level: the state 
transitions of the agents and of the intruder as well as the invariants the protocol 
should satisfy are described by ELAN rewrite rules. The implementation in ELAN 
is both natural and concise and the rewrite rules describing the protocol are 
directly obtained from a classical presentation like the one presented in Section 3. 
The execution of the specification allows one, on one hand to describe attacks and 
on the other hand to certify the corrected version by exploring all the possible 
behaviors. 

Section 2 briefly presents the ELAN environment. The Needham-Schroeder 
public-key protocol is described in Section 3 together with an attack and a 
corrected version. Section 4 presents the formalization in ELAN of the Needham- 
Schroeder public-key protocol. The data structures and the rewrite rules are 
presented together with the strategies used for discovering the attack. Some op- 
timizations are proposed and some considerations related to existing approaches 
are presented. The last section of the paper contains the conclusions and gives 
further perspectives for this work. 



2 The ELAN Environment 

ELAN is a language for designing and executing computational systems. In ELAN, 
a logic can be expressed by specifying its syntax and its inference rules. The syn- 
tax can be described using mixfix operators and the inference rules are described 
by conditional rewrite rules. In order to guide the application of the rewrite rules 
strategies are introduced. A description of the language and its implementation 
as well as a survey of several examples are given in [BKK+98]. 

All rewrite rules are working on equivalence classes induced by the set of 
equations E that, in the case of ELAN, is restricted to associativity and commu- 
tativity axioms, for the symbols defined to be associative-commutative. 

A labeled rewrite rule in ELAN is defined as a pair of terms built on functional 
symbols and local variables. Additionally it can be applied under some condi- 
tions and it can use some local assignments. The local assignments are let-like 
constructions that allow applications of strategies on some terms. The general 
syntax of an ELAN rule is: 

[ℓ] l ⇒ r [ if cond | where y := (S)u ]* 

and the rule is applied if the condition cond is satisfied and the strategy S is 
successfully applied on the term u. 

Several rules may have the same label, the resulting ambiguities being 
handled by the strategies. The rule label is optional. When it is omitted, it is the 
responsibility of the designer to provide a confluent and terminating set of unlabeled 
rewrite rules. 

The application of the labeled rewrite rules is controlled by user-defined 
strategies while the unlabeled rules are applied according to a default normal- 
ization strategy. The normalization strategy consists in applying unlabeled rules 
at any position of a term until the normal form is reached, this strategy being 
applied after each reduction produced by a labeled rewrite rule. 

The application of a rewrite rule in ELAN can yield several results due to the 
equational (associative-commutative) matching and to the where clauses, which 
can also return several results. 

The non-determinism is handled mainly by two strategy operators: dont 
care choose (denoted dc(s1, ..., sn)) that returns the results of at most one 
non-deterministically chosen unfailing strategy from its arguments and dont 
know choose (denoted dk(s1, ..., sn)) that returns all the possible results. The 
strategy operator repeat* applies sub-strategies in a loop until none of them 
is applicable and the operator ; is used for the sequential composition of two 
strategies. 
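
These operators can be mimicked outside ELAN by treating a strategy as a function from a term to the list of its results, an empty list meaning failure. The Python sketch below is an illustration under that assumption only: the names dk, dc, seq and repeat_star echo the ELAN operators but are not ELAN's implementation, and dc simply picks the first unfailing argument, which is one admissible choice among the non-deterministic ones.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")
Strategy = Callable[[T], List[T]]  # term -> all results; [] means failure


def dk(*strategies):
    """dont know choose: collect the results of every argument strategy."""
    def run(term):
        results = []
        for s in strategies:
            results.extend(s(term))
        return results
    return run


def dc(*strategies):
    """dont care choose: results of one unfailing strategy (here the first)."""
    def run(term):
        for s in strategies:
            results = s(term)
            if results:
                return results
        return []
    return run


def seq(first, second):
    """Sequential composition (';'): apply second to each result of first."""
    def run(term):
        return [r2 for r1 in first(term) for r2 in second(r1)]
    return run


def repeat_star(strategy):
    """Iterate strategy; return the terms on which it is no longer applicable."""
    def run(term):
        finished, frontier = [], [term]
        while frontier:
            current = frontier.pop()
            successors = strategy(current)
            if successors:
                frontier.extend(successors)
            else:
                finished.append(current)
        return finished
    return run


# Toy rules on integers: a rule fails (returns []) when its guard is false.
halve = lambda n: [n // 2] if n > 1 and n % 2 == 0 else []
decrement = lambda n: [n - 1] if n > 1 and n % 2 == 1 else []
is_one = lambda n: [n] if n == 1 else []

normalise = seq(repeat_star(dk(halve, decrement)), is_one)
```

Here normalise has the same shape as a strategy of the form repeat*(dk(...)) ; check, with integer rules standing in for rewrite rules on protocol states.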

The ELAN environment allows us to get the trace of the computations 
executed and to obtain statistics about the application of the rewrite rules. When 
the specification describing the Needham-Schroeder public-key protocol is 
executed and an attack is discovered, the detailed description of the attack can be 
obtained by analyzing the ELAN trace of the execution. 



3 The Needham-Schroeder Public-Key Protocol 

The Needham-Schroeder public-key protocol [NS78] aims to establish mutual 
authentication between an initiator and a responder that communicate via an 
insecure network. Each agent A possesses a public key denoted K(A) that can 
be obtained by any other agent from a key server and a (private) secret key 
that is the inverse of K(A). A message m encrypted with the public key of the 
agent A is denoted by {m}_K(A) and can be decrypted only by the owner of the 
corresponding secret key, i.e. by A. 

In this paper we only consider the simplified version proposed in [Low95] 
assuming that each agent knows at the beginning the public keys of all the other 
agents. 
1. A → B: {Na, A}_K(B) 
2. B → A: {Na, Nb}_K(A) 
3. A → B: {Nb}_K(B) 

The protocol uses nonces that are fresh random numbers to be used in a 
single run of the protocol. We denote the nonce generated by the agent A by 
Na. 

The initiator A seeks to establish a session with the agent B. For this, A 
sends a message to B containing a newly generated nonce Na together with its 
identity, and this message is encrypted with B's public key K(B). When such a message 
is received by B, the agent can decrypt it and extract the nonce Na and the 
identity of the sender. The agent B generates a new nonce Nb and sends it to 
A together with Na in a message encrypted with the public key of A. When A 
receives this response, it can decrypt the message and assumes that a session 
has been established with B. The agent A sends the nonce Nb back to B and 
when receiving this last message B assumes that a session has been established 
with A since only A could have decrypted the message containing Nb. 
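
The exchange can be replayed symbolically. The Python sketch below is only an illustration: encryption is modelled as a tagged tuple that its key owner alone may open, nonces are symbolic rather than random, and the names Encrypted, encrypt, decrypt, nonce and honest_run are hypothetical.

```python
from typing import NamedTuple, Tuple


class Encrypted(NamedTuple):
    key_owner: str   # the agent whose public key K(owner) was used
    payload: Tuple   # the protected content


def encrypt(owner, *payload):
    return Encrypted(owner, tuple(payload))


def decrypt(agent, message):
    """Only the owner of the matching private key can open the message."""
    if message.key_owner != agent:
        raise PermissionError(
            f"{agent} cannot decrypt a message for {message.key_owner}")
    return message.payload


def nonce(creator, peer):
    # Symbolic nonce; a real run would use a fresh random number.
    return ("N", creator, peer)


def honest_run(a, b):
    """Replay the three protocol messages between a and b."""
    na = nonce(a, b)
    msg1 = encrypt(b, na, a)                   # 1. A -> B : {Na, A}_K(B)
    received_na, sender = decrypt(b, msg1)

    nb = nonce(b, sender)
    msg2 = encrypt(sender, received_na, nb)    # 2. B -> A : {Na, Nb}_K(A)
    echoed_na, received_nb = decrypt(a, msg2)
    assert echoed_na == na                     # A recognises its own nonce

    msg3 = encrypt(b, received_nb)             # 3. A -> B : {Nb}_K(B)
    (echoed_nb,) = decrypt(b, msg3)
    assert echoed_nb == nb                     # B recognises its own nonce
    return {"a_believes_peer": b, "b_believes_peer": sender}
```

In this honest run both agents end up with the correct belief about their peer; nothing in the sketch models an intruder.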

The main property expected of an authentication protocol like the Needham- 
Schroeder public-key protocol is to prevent an intruder from impersonating one 
of the two agents. 

The intruder is a user of the communication network, so it can initiate 
standard sessions with the other agents and it can respond to messages sent 
by the other agents. The intruder can intercept any message from the network 
and can decrypt the messages encrypted with its key. The nonces obtained from 
the decrypted messages can be used by the intruder for generating new (fake) 
messages. The intercepted messages that cannot be decrypted by the intruder 
are replayed as they are. 

An attack on the protocol is presented in [Low95] where the intruder 
impersonates an agent A in order to establish a session with an agent B. The attack 
involves two simultaneous runs of the protocol: one initiated by A in order to 
establish a communication with the intruder I and a second one initiated by I 
that tries to impersonate A (denoted by I(A)) in order to establish a 
communication with B. The attack involves the following steps, where I.n and II.n represent 
steps in the first and second session respectively and I(A) represents the intruder 
impersonating the agent A: 

I.1.   A → I:     {Na, A}_K(I) 
II.1.  I(A) → B:  {Na, A}_K(B) 
II.2.  B → I(A):  {Na, Nb}_K(A) 
I.2.   I → A:     {Na, Nb}_K(A) 
I.3.   A → I:     {Nb}_K(I) 
II.3.  I(A) → B:  {Nb}_K(B) 


The agent A tries to establish a session with the intruder I by sending a 
newly generated nonce Na. The intruder decrypts the message and initiates a 
second session with B but claiming to be A. The agent B responds to I with a 
message encrypted with the key of A and the intruder intercepts it and forwards 
it to A. The agent A is able to decrypt this last message and sends the appropriate 
response to I. The intruder can thus obtain the nonce Nb and sends it encrypted to B. 
At this moment B thinks that a session has been established with A while this 
session has in fact been established with the intruder. 
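
The six steps of the attack can be traced with the same kind of symbolic model. The following Python sketch is an illustration only; the helpers enc and dec and the function lowe_attack are hypothetical names, and the nonces are plain strings.

```python
def enc(owner, *payload):
    """Symbolic public-key encryption: only `owner` can open the result."""
    return ("enc", owner, payload)


def dec(agent, message):
    _tag, owner, payload = message
    if owner != agent:
        raise PermissionError(f"{agent} cannot decrypt a message for {owner}")
    return payload


def lowe_attack():
    """Interleave session I (A with I) and session II (I(A) with B)."""
    na, nb = "Na", "Nb"

    m_i1 = enc("I", na, "A")                 # I.1   A    -> I : {Na, A}_K(I)
    learned_na, claimed = dec("I", m_i1)     # the intruder learns Na

    m_ii1 = enc("B", learned_na, claimed)    # II.1  I(A) -> B : {Na, A}_K(B)
    b_na, b_peer = dec("B", m_ii1)           # B believes it is talking to A

    m_ii2 = enc(b_peer, b_na, nb)            # II.2  B -> I(A) : {Na, Nb}_K(A)
    try:
        dec("I", m_ii2)                      # the intruder cannot read it ...
        intruder_could_read = True
    except PermissionError:
        intruder_could_read = False

    _a_na, a_nb = dec("A", m_ii2)            # I.2   I -> A : replayed unchanged
    m_i3 = enc("I", a_nb)                    # I.3   A -> I : {Nb}_K(I)
    (stolen_nb,) = dec("I", m_i3)            # ... but A hands Nb over

    m_ii3 = enc("B", stolen_nb)              # II.3  I(A) -> B : {Nb}_K(B)
    (b_nb,) = dec("B", m_ii3)

    return {
        "b_commits_with": b_peer if b_nb == nb else None,
        "intruder_knows": {learned_na, stolen_nb},
        "intruder_read_ii2": intruder_could_read,
    }


result = lowe_attack()
```

The result records that B commits with A although the intruder holds both nonces, without the intruder ever decrypting a message protected by the key of A.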

4 Encoding the Needham-Schroeder Protocol in ELAN 

In this section we give a description of the protocol in ELAN. The ELAN rewrite 
rules correspond to transitions of agents from one state to another after sending 
and/or receiving messages. The strategies guiding the rewrite rules describe a 
form of model-checking in which all the possible behaviors are explored. 

Although in [Low96] it has been shown that the correctness properties 
obtained for the protocol involving one initiator, one responder and one intruder 
can be generalized to an arbitrary number of agents and intruders, we have 
chosen to use a variable number of agents. Therefore, the number of initiators and 
responders is not fixed in the ELAN specification and it should be given at 
execution time. This allowed us on one hand to show the expressiveness of an ELAN 
specification and on the other hand to compare the results, in terms of efficiency, 
with other approaches like Murφ [DDHY92]. 

4.1 Data Structures 

The initiators and the responders are agents described by their identity, their 
state and a nonce they have created. An agent can be defined in ELAN using a 
mixfix operator: 

@ + @ + @ : ( AgentId SWC Nonce ) Agent; 

The symbol @ is a placeholder for terms of types AgentId, SWC and Nonce 
respectively representing the identity, the state and the current nonce of a given 
agent. 

There are three possible values of SWC states. An agent is in the state SLEEP 
if it has neither sent nor received a request for a new session. In the state WAIT 
the agent has already sent or received a request and when reaching the state 
COMMIT the agent has established a session. 

The nonces generated in the ELAN implementation are not random numbers 
but store some information indicating the agents using the nonce. A nonce cre- 
ated by an agent A in order to communicate with an agent B is represented by 
N(A,B). Memorizing the nonce allows the agent to know at each moment who is 
the agent with whom it is establishing a session and the two identities from the 
nonce are used when verifying the invariants of the protocol. A dummy nonce is 
represented by DN or N(di,di). 

A set of agents (setAgent) is described using an associative-commutative 
(AC) operator *, which expresses concisely that the order of agents 
in a set of agents is not important: 

@ : ( Agent ) setAgent; /* an agent is a set of agents */ 
@ * @ : ( setAgent setAgent ) setAgent (AC); 
nnl : setAgent; /* empty set of agents */ 

The agents exchange messages defined by: 

@-->@:@[@,@,@] : ( AgentId AgentId Key Nonce Nonce Address ) message; 

A message of the form A-->B:K[N1,N2,Add] is a message sent from A to B 
and contains the two nonces N1 and N2 together with the explicit address of 
the sender, Add. A dummy address is represented by DA or A(di). The header 
of the message contains the source and destination address of the message but 
since they are not encrypted they can be faked by the intruder. The body of the 
message is encrypted with the key K and can be decrypted only by the owner of 
the private key. 

As for the agents, a set of nonces (setNonce) is defined using the associative- 
commutative operator |. The communication network (network) is a set of 
messages defined using the associative-commutative operator &. 

The intruder not only participates in normal communications but can 
also intercept and create (fake) messages. Therefore a new data structure is 
used for intruders: 

@ # @ # @ : ( AgentId setNonce network ) intruder; 

where the first field represents the identity of the intruder, the second one is the 
set of known nonces and the third one the set of intercepted messages. 

The ELAN rewrite rules are used to describe the modifications of the global 
state that consists of the states of all the agents involved in the communication 
and the state of the network. The global state is defined by: 

@ <> @ <> @ <> @ : ( setAgent setAgent intruder network ) state; 

where the first two fields represent the set of initiators and responders, the third 
one represents the intruder and the last one the network. 
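
For readers unfamiliar with mixfix declarations, the same data structures can be approximated in Python. This is a rough analogue, not a translation of the ELAN specification: all class and field names are hypothetical, and frozensets stand in for the associative-commutative operators, so duplicate elements collapse, which the ELAN operators do not necessarily do.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

Nonce = Tuple[str, str]              # N(A,B): created by A to talk to B
DUMMY_NONCE: Nonce = ("di", "di")    # plays the role of DN


@dataclass(frozen=True)
class Agent:                         # id + state + nonce
    ident: str
    state: str                       # "SLEEP", "WAIT" or "COMMIT"
    nonce: Nonce


@dataclass(frozen=True)
class Message:                       # src-->dst:K(key)[n1,n2,A(addr)]
    src: str
    dst: str
    key: str                         # owner of the public key protecting the body
    n1: Nonce
    n2: Nonce
    addr: str


@dataclass(frozen=True)
class Intruder:                      # id # known nonces # intercepted messages
    ident: str
    nonces: FrozenSet[Nonce] = frozenset()
    stored: FrozenSet[Message] = frozenset()


@dataclass(frozen=True)
class State:                         # initiators <> responders <> intruder <> network
    initiators: FrozenSet[Agent]
    responders: FrozenSet[Agent]
    intruder: Intruder
    network: FrozenSet[Message] = frozenset()


def initial_state(initiators, responders, intruder_id):
    """All honest agents start in SLEEP holding a placeholder nonce."""
    sleeping = lambda name: Agent(name, "SLEEP", (name, name))
    return State(frozenset(map(sleeping, initiators)),
                 frozenset(map(sleeping, responders)),
                 Intruder(intruder_id))
```

Because every component is immutable and hashable, whole states can be compared and stored in sets, which is convenient when exploring the state space.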

4.2 Rewrite Rules 

The rewrite rules describe the behavior of the honest agents involved in a session 
and the behavior of the intruder that tries to impersonate one of the agents. We 
will see that the invariants of the protocol are expressed by rewrite rules as well. 

The Agents. Each modification of the state of one of the participants to a 
session is described by a rewrite rule. At the beginning, all the agents are in the 
state SLEEP waiting either to initiate a session or to receive a request for a new 
session. 

When an initiator is in the state SLEEP, it can initiate a session with one of 
the responders by sending the appropriate message as defined in the first step of 
the protocol. We present the fully detailed ELAN rules describing this step but 
since the type variable declarations are very similar for the other rules, this part 
will be omitted in what follows. 
rules for state 
  x,y,w : AgentId; resp,init : Nonce; std : SWC; 
  IN,RE : setAgent; l : setNonce; ll,net : network; 
global 
[initiator_1] 
  (x+SLEEP+resp)*IN <> (y+std+init)*RE <> w#l#ll <> net => 
  (x+WAIT+N(x,y))*IN <> (y+std+init)*RE <> w#l#ll <> 
  x-->y:K(y)[N(x,y),N(di,di),A(x)] & net end 
end 

In the above rewrite rule x and y are variables of type AgentId representing 
the identity of the initiator and the identity of the responder respectively. The 
initiator sends a nonce N(x,y) and its address (identity) encrypted with the 
public key of the responder and goes into the state WAIT where it waits for a 
response. Since only one nonce is necessary in this message, a dummy nonce 
N(di,di) is used in the second field of the message. The message is sent by 
including it in the set of messages available on the network. 

Since the operator * is associative and commutative, when applying the 
first rule initiator_1, the initiator x is selected non-deterministically from the 
set of initiators. The destination of the message is obtained by selecting non- 
deterministically one of the agents from the set of responders. A second rule 
initiator_1 considers the case where the destination of the initial message is 
the intruder instead of a responder (i.e. the sent message uses w instead of y). 
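
The effect of associative-commutative matching on this rule can be imitated by enumerating every way of picking a sleeping initiator and a responder. The Python sketch below is a hedged illustration of the first initiator_1 rule only; the tuple encodings and the function name are hypothetical, and the intruder component of the state is omitted for brevity.

```python
def initiator_1(state):
    """Yield every state reachable by one application of the rule.

    A state is (initiators, responders, network); agents are
    (id, swc, nonce) triples kept in frozensets, so no order is imposed.
    """
    initiators, responders, network = state
    for agent in initiators:
        x, swc, _ = agent
        if swc != "SLEEP":
            continue                       # pattern x+SLEEP+resp does not match
        for (y, _, _) in responders:
            nonce = ("N", x, y)
            message = (x, y, ("K", y), nonce, ("N", "di", "di"), ("A", x))
            yield (
                (initiators - {agent}) | {(x, "WAIT", nonce)},
                responders,
                network | {message},
            )


start = (
    frozenset({("a", "SLEEP", ("N", "a", "a"))}),
    frozenset({("b", "SLEEP", ("N", "b", "b")),
               ("c", "SLEEP", ("N", "c", "c"))}),
    frozenset(),
)
successors = list(initiator_1(start))
```

With one initiator and two responders the rule has two possible results, one per choice of destination.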

If the destination of the previously sent message is a responder in the state 
SLEEP, then this agent gets the message and decrypts it, if encrypted with its 
key. Afterwards, it sends the second message from the protocol to the initiator 
and goes into the state WAIT and waits for the final acknowledgement: 

[responder_1] IN <> (y+SLEEP+init)*RE <> w#l#ll <> 
              w-->y:K(y)[N(n1,n3),N(n2,n4),A(z)] & net => 
              IN <> (y+WAIT+N(y,z))*RE <> w#l#ll <> 
              y-->z:K(z)[N(n1,n3),N(y,z),A(y)] & net 

One should notice that thanks to the associative-commutative definition of 
the operator &, the position of the message in the network is not important. 
A non-associative-commutative definition would have implied additional rewrite 
rules for describing the search of the appropriate message in the network. 

The condition that the message is encrypted with the public key of the re- 
sponder is implicitly tested due to the matching that should instantiate the 
variable y from the pattern y+SLEEP+init and from K(y) with the same agent 
identity. Therefore, we do not have to add an explicit condition to the rewrite 
rule that is thus kept simple and efficient. 

We have used the optimization proposed in [MMS97] and considered that the 
initiator and the responder listen only to messages coming from the intruder. 
Using the same variable w for the source of the message and for the identity 
of the intruder we can implicitly specify this property. Since all the messages 
are intercepted by the intruder and are then forwarded to the other agents, no 
message is lost when using the above optimization. 

Two other rewrite rules, initiator_2 and responder_2, describe the other 
message exchanges from a session. When an initiator x and a responder y have 
reached the state COMMIT at the end of a correct session, the nonce N(y,x) can 
be used as a symmetric encryption key for further communications between the 
two agents. 

The Intruder. The intruder can be viewed as a normal agent that not 
only participates in normal sessions but also tries to break the security of 
the protocol by obtaining information that is supposed to be confidential. The 
network that serves as communication support is common to all the agents and 
therefore all the messages can be observed or intercepted and new messages can 
be inserted in it. 

Hence, the specification should describe an intruder that is able to observe 
and intercept any message in the network, decrypt messages encrypted with its 
key and store the obtained information, replay intercepted messages and gener- 
ate fake messages starting from the information it has gained. In our approach 
instead of just observing the messages, we intercept them systematically and 
replay them unchanged. 

The intruder intercepts all the messages in the network except for the mes- 
sages generated by itself and stores or decrypts them. If a message is encrypted 
with its key, the intruder decrypts it and stores the obtained nonces: 

[intruder_1] IN <> RE <> w#l#ll <> 
             z-->x:K(w)[N(n1,n3),N(n2,n4),A(v)] & net => 
             IN <> RE <> w#N(n1,n3)|N(n2,n4)|l#ll <> net 
             if w!=z /* not its messages */ 

If the message is not encrypted with the intruder’s key and if it has not been 
sent by the intruder, it is just stored and can be replayed later by the intruder: 

[intruder_2] 
  IN <> RE <> w#l#ll <> z-->x:K(y)[N(n1,n3),N(n2,n4),A(v)] & net => 
  IN <> RE <> w#l#z-->x:K(y)[N(n1,n3),N(n2,n4),A(v)] & ll <> net 
  if w!=z // not its messages 
  if w!=y // not encrypted with its key 

By checking that the source of the messages intercepted by the intruder is 
not the intruder itself we avoid the intruder cycling in a loop generate-intercept 
on the same message. 
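
The two interception rules can be summarised by a single function over a simplified state. This Python sketch is an illustration under assumptions: the tuple layout and the name intercept are hypothetical, and the function processes all messages at once, whereas each ELAN rule application handles one message.

```python
def intercept(intruder, network):
    """Apply intruder_1 / intruder_2 to every interceptable message.

    `intruder` is (id, known_nonces, stored_messages); a message is
    (src, dst, key_owner, n1, n2, addr).
    Returns the updated intruder and the remaining network.
    """
    ident, nonces, stored = intruder
    remaining = set()
    for message in network:
        src, _dst, key_owner, n1, n2, _addr = message
        if src == ident:
            remaining.add(message)        # w != z fails: its own messages stay
        elif key_owner == ident:
            nonces = nonces | {n1, n2}    # intruder_1: decrypt, keep the nonces
        else:
            stored = stored | {message}   # intruder_2: keep for later replay
    return (ident, nonces, stored), frozenset(remaining)


to_intruder = ("a", "i", "i", ("N", "a", "i"), ("N", "di", "di"), "a")
opaque = ("b", "a", "a", ("N", "a", "b"), ("N", "b", "a"), "b")
own = ("i", "b", "b", ("N", "a", "b"), ("N", "di", "di"), "a")

new_intruder, new_network = intercept(
    ("i", frozenset(), frozenset()),
    frozenset({to_intruder, opaque, own}),
)
```

The guard on the source of the message is what prevents the generate-intercept loop mentioned above: messages produced by the intruder are never picked up again.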

The messages stored by the intruder are sent to all the agents without 
modifying the encrypted part but specifying that the message comes from the 
intruder. The destination of the message is selected non-deterministically from the 
union between the set of initiators and the set of responders by using the 
strategy extAgent that selects a random agent from the set passed to the function 
elemIA. 

[intruder_3] 
  IN <> RE <> w#l#x-->y:K(z)[N(n1,n3),N(n2,n4),A(v)] & ll <> net => 
  IN <> RE <> w#l#x-->y:K(z)[N(n1,n3),N(n2,n4),A(v)] & ll <> 
              w-->t:K(z)[N(n1,n3),N(n2,n4),A(v)] & net 
  where (Agent) t+std+dn := (extAgent) elemIA(RE * IN) 
  if not(existMess(w-->t:K(z)[N(n1,n3),N(n2,n4),A(v)], net)) 

The nonces previously obtained by the intruder are used in order to generate 
fake messages. These fake messages are sent to all the agents in the network and 
the intruder tries to impersonate all the agents by using a random address xadd. 

[intruder_4] IN <> RE <> w#resp|init|l#ll <> net => 
             IN <> RE <> w#resp|init|l#ll <> 
             w-->y:K(y)[resp,init,A(xadd)] & net 
             where (Agent) y+std+dn := (extAgent) elemIA(RE * IN) 
             where (Agent) xadd+std1+dn1 := (extAgent) elemIA(RE * IN) 
             if not(existMess(w-->y:K(y)[resp,init,A(xadd)], net)) 

This time both the destination of the message and the fake address in the 
encrypted part of the message are obtained with the non-deterministic strategy 
extAgent. At each application of the latter rule a different message is generated 
and is possibly sent to a different destination. If the current message does not 
lead to an attack, a backtrack is performed and a new destination and/or address 
are selected. We go on like this until an attack is discovered or no new messages 
can be generated. This allows the intruder to generate all the possible messages 
and send them (if not already sent) to all the possible agents. 

If only one nonce is available the intruder cannot generate fakes for messages 
sent by the responder in the second step of the protocol but can still build 
messages containing only one valid nonce and this behavior is described by a 
separate rule: 

[intruder_4] IN <> RE <> w#resp|l#ll <> net => 
             IN <> RE <> w#resp|l#ll <> 
             w-->y:K(y)[resp,DN,A(xadd)] & net 
             where (Agent) y+std+dn := (extAgent) elemIA(RE * IN) 
             where (Agent) xadd+std1+dn1 := (extAgent) elemIA(RE * IN) 
             if not(existMess(w-->y:K(y)[resp,init,A(xadd)], net)) 



The Invariants. Now, we present the invariants used to specify the correctness 
condition of the protocol. Firstly, the authenticity of the responder can be tested 
by verifying that if an initiator A committed with a responder B, then B has really 
been involved in the protocol. Instead of specifying this condition we define its 
negation that can be seen as a violation of the authenticity of the protocol. The 
rewrite rule describing the negation of the invariant checks if an initiator is in 
the state COMMIT while the corresponding responder (that is not an intruder) 
has neither committed nor sent an appropriate response. Thus, we should check 
that there exists no responder either in the state COMMIT or in the state WAIT 
and waiting for an acknowledgement from the initiator. This test is performed 
by using a very simple set of unlabeled rewrite rules describing the operator 
existAg that tests the presence of an agent in a set of agents and we obtain the 
rule: 

[attack_1] (x+COMMIT+N(x,y))*IN <> RE <> w#l#ll <> net => ATTACK 
           if y!=w /* not the intruder */ 
           if not(existAg(y+WAIT+N(y,x),RE)) and 
              not(existAg(y+COMMIT+N(y,x),RE)) 

For the authenticity of the initiator we verify that if a responder committed 
with an initiator then the initiator has committed as well and with the appro- 
priate responder. We proceed similarly as for the first invariant and we specify 
the negation of the invariant by a rewrite rule: 

[attack_2] IN <> (y+COMMIT+N(y,x))*RE <> w#l#ll <> net => ATTACK 
           if x!=w /* not the intruder */ 
           if not(existAg(x+COMMIT+N(x,y),IN)) 

If one of these two rewrite rules can be applied during the execution of the 
specification then the authenticity of the protocol is not ensured and an attack 
can be described from the trace of the execution. If the rewrite rule attack_l 
can be applied we can conclude that the responder has been impersonated by 
the intruder and if the rewrite rule attack_2 can be applied we can conclude 
that the initiator has been impersonated by the intruder. 
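
The two negated invariants amount to simple membership tests on the sets of agents. The Python sketch below mirrors the conditions of attack_1 and attack_2 as stated above; the tuple encoding of agents and nonces and the function names are hypothetical.

```python
def exist_ag(agent, agents):
    """Counterpart of the existAg test: is the agent present in the set?"""
    return agent in agents


def attack_1(initiators, responders, intruder_id):
    """An initiator committed with y, but y neither waits nor committed."""
    for (x, swc, nonce) in initiators:
        if swc != "COMMIT":
            continue
        _, creator, y = nonce                 # N(x,y): created by x for y
        if creator != x or y == intruder_id:
            continue
        expected = ("N", y, x)
        if not exist_ag((y, "WAIT", expected), responders) and \
           not exist_ag((y, "COMMIT", expected), responders):
            return True
    return False


def attack_2(initiators, responders, intruder_id):
    """A responder committed with x, but x did not commit with it."""
    for (y, swc, nonce) in responders:
        if swc != "COMMIT":
            continue
        _, creator, x = nonce                 # N(y,x): created by y for x
        if creator != y or x == intruder_id:
            continue
        if not exist_ag((x, "COMMIT", ("N", x, y)), initiators):
            return True
    return False


# Final state of the attack of Section 3: a committed with i, b with a.
attacked_in = {("a", "COMMIT", ("N", "a", "i"))}
attacked_re = {("b", "COMMIT", ("N", "b", "a"))}
# Final state of a correct session between a and b.
correct_in = {("a", "COMMIT", ("N", "a", "b"))}
correct_re = {("b", "COMMIT", ("N", "b", "a"))}
```

On the attacked state only the second test fires, matching the reading that the initiator has been impersonated; on the correct state neither fires.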

4.3 Strategies 

The rewrite rules used to specify the behavior of the protocol and the invariants 
should be guided by a strategy describing their application. Basically, we want to 
apply repeatedly all the above rewrite rules in any order and in all the possible 
ways until one of the attack rules can be applied. 

The strategy is easy to define in ELAN by using the non-deterministic choice 
operator dk, the repeat* operator representing the repeated application of a 
strategy and the ; operator representing the sequential application of two strate- 
gies: 

[] attStrat => 
   repeat*( dk( attack_1, attack_2, 
                intruder_1, intruder_2, intruder_3, intruder_4, 
                initiator_1, initiator_2, responder_1, responder_2 
          )); attackFound 

The strategy tries to apply one of the rewrite rules given as argument to 
the dk operator starting with the rules for attacks and intruders and ending 
with the rules for the honest agents. If the application succeeds the state is 
modified accordingly and the repeat* strategy tries to apply a new rewrite rule 
on the result of the rewriting. When none of the rules is applicable, the repeat* 
operator returns the result of the last successful application. Since the repeat* 
strategy is sequentially composed with the attackFound strategy, this latter 
strategy is applied on the result of the repeat* strategy. 

The strategy attackFound is nothing else but the rewrite rule: 

[attackFound] ATTACK => ATTACK end 



If an attack has not been found and therefore the strategy attackFound 
cannot be applied a backtrack is performed to the last rule applied successfully 
and another application of the respective rule is tried. If this is not possible the 
next rewrite rule is tried and if none of the rules can be applied a backtrack is 
performed to the previous successful application. 

If the result of the strategy repeat* reveals an attack, then the strategy 
attackFound can be applied and the overall strategy succeeds. The trace of the 
attack can be recovered in the ELAN environment. 
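The interplay between dk, repeat*, sequential composition and backtracking can be mimicked with lazy generators. The following is a minimal sketch, not the ELAN implementation: a rule is assumed to be a function returning all the results of applying it to a state, the names `dk`, `repeat_star`, `seq` and `attack_found` are ours, and termination is only guaranteed for a finite, acyclic state space.

```python
ATTACK = "ATTACK"


def dk(*rules):
    """Dont-know choice: enumerate every result of every rule, in the given order."""
    def strategy(state):
        for rule in rules:
            yield from rule(state)
    return strategy


def repeat_star(strategy):
    """Apply `strategy` until it fails; yield the last state reached.
    Asking the generator for another value corresponds to a backtrack."""
    def repeated(state):
        applied = False
        for successor in strategy(state):
            applied = True
            yield from repeated(successor)
        if not applied:
            yield state
    return repeated


def seq(first, second):
    """Sequential composition (the ; operator)."""
    def composed(state):
        for intermediate in first(state):
            yield from second(intermediate)
    return composed


def attack_found(state):
    """Counterpart of the rule ATTACK => ATTACK: succeeds only on ATTACK."""
    if state == ATTACK:
        yield ATTACK


def att_strat(*rules):
    return seq(repeat_star(dk(*rules)), attack_found)
```

Taking the first value of `att_strat(...)(initial_state)` explores the states depth-first and stops at the first attack; the order of the arguments of `dk` decides which branch is explored first.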

Let us consider, for example, that we have only one initiator a and one 
responder b trying to establish a session while the intruder i tries to impersonate 
one of the two agents. The initial state is represented by 

a+SLEEP+N(a,a)*nnl <> b+SLEEP+N(b,b)*nnl <> i#nl#nill <> nill



According to the strategy attStrat the following sequence of rewrite rules 
(with the corresponding results) is applied in the repeat* loop: 



initiator_1: a+WAIT+N(a,b)*nnl <> b+SLEEP+N(b,b)*nnl <>
             i#nl#nill <> a-->b:K(b)[N(a,b),N(di,di),A(a)]&nill
responder_1: a+WAIT+N(a,b)*nnl <> b+WAIT+N(b,a)*nnl <>
             i#nl#nill <> b-->a:K(a)[N(a,b),N(b,a),A(b)]&nill
initiator_2: a+COMMIT+N(a,b)*nnl <> b+WAIT+N(b,a)*nnl <>
             i#nl#nill <> a-->b:K(b)[N(b,a),N(di,di),DA]&nill
responder_2: a+COMMIT+N(a,b)*nnl <> b+COMMIT+N(b,a)*nnl <>
             i#nl#nill <> nill



At this moment, no rewrite rule can be applied on the result of the rule 
responder_2, that represents a state in which a correct session between a and b 
has been established. Therefore, the strategy repeat* returns it as result and the 
strategy attackFound is tried without success. A backtrack is then performed 
to the last successful application (i.e. responder_2) and another rewrite rule is 
tried on the state generated by initiator_2. 

The rewrite rule intruder_2 can be executed but this application will not 
lead to an attack. Several backtracks are performed without reaching an attack 
until an alternative rewrite rule application is tried for the initial term. In our 
case the only agent in RE is b and an application of the rule using this identity has 
already been tried. Thus, the next initiator_1 rule is used and the following
application is obtained: 

initiator_1: a+WAIT+N(a,i)*nnl <> b+SLEEP+N(b,b)*nnl <> i#nl#nill
             <> a-->i:K(i)[N(a,i),N(di,di),A(a)]&nill







Starting from this last rule application an attack is discovered and the ELAN 
trace describes exactly the attack shown in Section 3, only that to each message 
sent by the intruder corresponds, in ELAN, the interception and the generation 
of the message. 

The strategy can be slightly modified in order to try the rules for honest 
agents before the rules of the intruder: 

[] attStrat =>
     repeat*( dk( attack_1, attack_2,
                  initiator_1, initiator_2, responder_1, responder_2,
                  intruder_1, intruder_2, intruder_3, intruder_4
     )); attackFound

As for the previous strategy, the last step of the resulting trace is an ATTACK
state, and if we analyze the sequence of rewrite rules applied in this case we obtain
the attack described in Section 3 but with the steps I.2 and II.2 merged into
a single step:

I+II.2. B -> A : {Na, Nb}K(A)

The message I.2 has been eliminated from the session between the agent A
and the intruder I, but the message II.2 is seen by the agent A as the second
message of its session and handled accordingly.

As we have seen, small modifications in the strategy can lead to different but 
still equivalent attacks. The strategy can influence the efficiency of finding the 
first attack and a shorter attack trace is obtained if we try the rules of a normal 
session before trying to attack the protocol. Thus, in the case of the Needham- 
Schroeder public-key protocol it is better to try the application of the rules of 
normal agents before trying the rules describing the intruder but it is difficult to 
define a general strategy scheme that would lead to the shortest attack for any 
analyzed protocol. Since the search space is explored exhaustively the strategy 
is not important if no attacks are possible on the protocol. 

In the correction shown to be sound in [Low96], the responder introduces its identity
in the encrypted part of the message and the initiator checks if this identity 
corresponds to the agent it has started the session with. The second step of the 
corrected protocol becomes: 

2. B -> A : {Na, Nb, B}K(A)

The ELAN specification can be easily modified in order to reflect the new 
version of the protocol containing this correction. Since in the rule responder_l 
the responder sends its identity, all we have to do is to check the correct identity 
in the rule describing the handling of the respective message initiator_2. The 
only difference w.r.t. the initial initiator_2 rule consists in adding the condi- 
tion testing the validity of the address. As expected, when the specification is 
executed with this modified rule no attacks are detected. 
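The effect of the added condition can be illustrated independently of ELAN. The sketch below assumes a hypothetical `Reply` record standing for the decrypted second message {Na, Nb, B}; the flag `check_identity` switches between the original and the corrected behavior of the initiator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reply:
    """Hypothetical view of the decrypted second message {Na, Nb, B}."""
    initiator_nonce: str
    responder_nonce: str
    responder_id: str


def initiator_accepts(reply: Reply, my_nonce: str, expected_responder: str,
                      check_identity: bool = True) -> bool:
    """Accept the reply only if it carries our nonce and, in the corrected
    protocol, the identity of the agent the session was started with."""
    if reply.initiator_nonce != my_nonce:
        return False
    if check_identity and reply.responder_id != expected_responder:
        return False
    return True


# a started a session with the intruder i, who forwards a reply produced by b.
forwarded = Reply(initiator_nonce="Na", responder_nonce="Nb", responder_id="b")
original_protocol = initiator_accepts(forwarded, "Na", "i", check_identity=False)
corrected_protocol = initiator_accepts(forwarded, "Na", "i", check_identity=True)
```

In the original protocol the forwarded reply is accepted, which is the step the attack relies on; with the identity check it is rejected.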

The same methodology can be used for checking properties other than
authenticity. For example, one can check that if no intruders are present, then the
initiators and responders in the final state are all in the state COMMIT. Another
property that can be tested in a similar way is the maximum number of messages 
the intruder intercepts in a run of the protocol. 

4.4 Comparing the ELAN Implementation to Existing Approaches 

The specification technique we have proposed in this paper allows us to explore 
a finite state space and we can consider it as a form of model-checking. This 
approach is well suited for detecting the attacks against the protocol and not for 
proving its correctness in an unbounded model. 

The Needham-Schroeder public-key protocol is also analyzed, together with
two other authentication protocols, in [MMS97] by using a general state
enumeration tool, Murφ. Several optimizations are used in order to improve the
efficiency of finding the attack or showing that there is no attack. 

As already mentioned, the first optimization considers that the initiator and 
the responder listen only to messages coming from the intruder. Another opti- 
mization proposed in [MMS97] consists in giving a type to each message and 
using this information in order to distribute only the appropriate messages to 
the concerned initiators and responders. The two optimizations are immediately 
implemented in the ELAN specification. 

In order to reduce the number of messages sent by the intruder we have 
proposed some modifications in the rules of the intruder handling the generation 
of messages. When messages of a certain form are generated starting from the 
intercepted nonces, they are sent only to agents that might be interested in the 
respective messages. For example, the messages containing two significant nonces 
are handled only by the initiators in the second step of the protocol and thus, 
these messages are sent only to the initiators from the set IN. 
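The routing decision behind this optimization can be sketched as a small function. This is only an illustration: nonces are represented as pairs, the pair ("di", "di") stands for the dummy nonce N(di,di) seen in the traces, and sending every other message to the responders is our assumption, based on the fact that the first and third messages of the protocol are handled by responders.

```python
DUMMY_NONCE = ("di", "di")  # stands for N(di,di), an unused nonce slot


def recipients(nonces, initiators, responders):
    """Return the set of agents a generated message should be sent to."""
    significant = [n for n in nonces if n != DUMMY_NONCE]
    if len(significant) == 2:
        # Second step of the protocol: only initiators handle such messages.
        return set(initiators)
    # Assumption: all other generated messages are of interest to responders only.
    return set(responders)


two_nonces = recipients([("a", "i"), ("b", "a")], {"a"}, {"b"})
one_nonce = recipients([("a", "i"), DUMMY_NONCE], {"a"}, {"b"})
```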

In order to compare our specification with the Murφ approach, we have realized
a similar but naive ELAN implementation that uses only syntactic operators
and built-in lists for representing the sets of agents, messages and nonces. These 
structures are then represented using associative-commutative (AC) operators 
and an implementation similar to the one presented in this paper is obtained. 
The optimization for the intruders is then introduced leading to an optimized 
ELAN specification. The following results have been obtained for showing that
there are no attacks in the corrected version of the protocol when variable numbers
of initiators, responders and messages in the network are considered:

[Table of results not recoverable from the source; the only surviving entry reads "298 rules".]

An approach based on rewriting logic has been proposed in [Den98]. The
rewrite rules are used for specifying the protocol and strategies are used for
exploring the state space. The expressiveness of the rewrite rules used in the two
approaches is comparable but the ELAN strategy is simpler and easier to modify.
The descriptions of the Needham-Schroeder public-key protocol in Maude and
ELAN are different and thus, a comparison in terms of efficiency would not be
significant.

5 Conclusions 

We have shown how computational systems can be used as a logical framework 
for representing the Needham-Schroeder public-key protocol. This approach can 
be easily extended to other authentication protocols and an implementation of 
the TMN protocol has already been developed. The rules describing the proto- 
col are naturally represented by conditional rewrite rules. The mixfix operators 
declared as associative-commutative allow us to express and easily handle the 
random selection of agents from a set of agents or of a message from a set of 
messages. 

As we have seen, the specification is very concise and easily modifiable. 
The rewrite rules describing the intruder are specific to the Needham-Schroeder
public-key protocol but they can be easily adapted to other authentication
protocols by simply modifying the structure of the messages sent in the
network. 

Starting from an implementation using associative-commutative operators,
the optimizations proposed in the Murφ approach as well as other optimizations
can be easily integrated by minor modifications in the initial specification. The
performance of the ELAN implementation is comparable to that of the Murφ approach
and is significantly improved when the proposed optimization is used.

Unlike the Murφ specification, the rewrite rules of the ELAN implementation
describe the behavior of the agents and of the intruder as well as the invariants to 
be verified but do not specify the parameters of the protocol, like the number of 
receivers and intruders or the size of the network. These parameters are provided 
in the query at execution time and thus, the same specification can be used for 
verifying the given protocol under different conditions. 

The strategy used for guiding the application of the rewrite rules is important 
when an attack on the protocol exists. The ELAN strategy is easily modifiable 
and we have seen that small changes can lead to different, but still equivalent, 
attacks. Additionally, different properties can be tested by slightly modifying 
the strategy and possibly replacing the rules describing the attack with other 
rules describing the corresponding property. 

The presented approach based on rewriting and model-checking explores a 
finite state space and thus, some other methods should be used in order to show 
that properties proved for the finite model can be lifted to an unbounded model. 
The proved properties can be generalized using hand-proofs like in [Low96] or 
techniques based on theorem proving. 

An obvious continuation of this work would be the use of a specification
language like CAPSL ([DM99]) or Casper ([Low98]) for our implementation, and
this seems fairly easy to realize due to the simplicity of our protocol description.
Such an interface would allow us to have in the same framework an efficient
detection of attacks using our approach and efficient proofs of the correctness
of the modified protocols using theorem provers.

Acknowledgements. We sincerely thank Claude Kirchner for his help
and suggestions concerning the implementation in ELAN.

Abstract. The goal of this project is to develop solutions to enhance
interoperability between bioinformatics applications. Most existing
applications adopt different data formats, forcing biologists into tedious
translation work. We propose the use of Nexus as an intermediate
representation language. We develop a complete grammar for Nexus and
we adopt this grammar to build a parser. The construction heavily re- 
lies on the peculiar features of Prolog, to easily derive effective parsing 
and translating procedures. We also develop a general parse tree format 
suitable for interconversion between Nexus and other formats. 



1 Introduction 

Information technology, powered by inexpensive and powerful computing re- 
sources, is increasingly becoming a critical component of research in many sci- 
entific fields. Despite this, information and computation technology has yet to 
realize its full potential, primarily due to the inability of scientists to easily 
accomplish the diversity of computational tasks required to solve high level sci- 
entific problems. For example, a typical bioinformatics analysis requires one to 
perform the following sequence of steps (as illustrated in Figure 1): 

• Retrieve the desired sequences from a public molecular sequence database 
such as GenBank or GSDB, e.g., as a result of a BLAST query. 

• Align the sequences using a program such as CLUSTAL W.

• Search for the best phylogenetic tree depicting the relationships among se- 
quences using such programs as PHYLIP or PAUP. 

Although this sequence of tasks could be viewed as a single pipeline (see Figure 1)
through which data flows, in practice one immediately encounters a major
difficulty with that metaphor. These programs take input in a form not provided
by the previous one and produce output incompatible with the next. Thus, between
each of these scientifically relevant steps, translation from one format to
another is required in order to accomplish the main goal. Unless the user is
particularly adept at constructing shell scripts (e.g., sed and awk) or programming
in languages such as Perl and Python, this translation must be done manually,
a tedious and error-prone procedure. Computational biologists are faced with a
barrier when attempting to use available software tools: they must either devote
significant energy to becoming programmers or they must spend endless hours
editing and checking their data files when they convert from the output of one
program to the input of another. This is a very ineffective use of scientific talent.

Fig. 1. A Standard Phylogenetic Inference Task Pipeline

Attempts have been made to develop general languages for the description 
of biological data (e.g., the efforts of the BioPerl group [2]). Of these proposals, 
the Nexus data description language [10] represents the most comprehensive 
effort to provide a universal, computer oriented language for data description 
for systematic biology applications. Nexus ideally encompasses all information a 
systematist or phylogenetic biologist may need (e.g., character data and trees). 
This makes Nexus the ideal data format for developers of new bioinformatics 
applications. In addition, the generality of Nexus makes it the most viable
alternative for a common intermediate data representation language, to be used 
as a communication bridge between bioinformatics applications, thus converting 
the ideal pipeline of Figure 1 into the actual pipeline depicted in Figure 2. 




Fig. 2. Modified Pipeline — using Nexus as Intermediate Language 



In the last few years Nexus has gained considerable success, but, in spite 
of this, the number of available tools to parse and manipulate Nexus files is 
limited. Existing applications which make use of Nexus (such as PAUP and 
MacClade) do not provide independent tools for parsing Nexus. Furthermore, 
since these applications focus only on certain blocks of a Nexus file, their parsing
capabilities are limited to such blocks. Most of these limitations, as our
experience showed, derive from the complex syntactic and semantic structure
of Nexus — including complex token definitions and a number of context-sensitive 
features — which makes the task of parsing arbitrary Nexus files challenging. 

The goal of this project is to develop general translation technology so that
many of the important file formats can be inter-converted in a seamless way.
We adopt Nexus as a common data representation language. To accomplish our
goals we provide a formalization of the grammar for Nexus, and use it to build a 
parser for the complete Nexus language. To our knowledge, this is the first time 
that this important task has been accomplished. This technology is based on 
logic programming (LP) techniques — the parsing capabilities of Prolog, includ- 
ing Definite Clause Grammars (DCGs), non-determinism, and the flexibility of 
unification and logical variables provide an excellent framework for tackling the 
complexity of Nexus (e.g., the context-sensitive components of the language). 

This project represents the first step towards a more ambitious goal: the
development of a domain-specific language [5] for describing biological processes at
a high level of abstraction, and for creating reliable bioinformatics applications.
In this paper we present the development of a specification of Nexus and its 
use to construct a complete parser. We also exhibit a preliminary approach for 
inter-conversion between Nexus and the PHYLIP [3] format. 

2 Data Formats in Bioinformatics Applications 

The abundance of input/output formats in bioinformatics is witnessed by the 
fact that all the commonly used software packages produce as output and try to 
recognize as input a plethora of formats. For example, the output of a search of
the genomic databases using BLAST (Basic Local Alignment Search Tool) [1]
could result in many different output forms depending on the actual BLAST
program (BLASTP, BLASTN, etc.) used and the choice of parameters for the
search. For instance, the histogram of expectations will be present or absent in the
output of a BLAST search depending on whether the histogram option was chosen.
Similarly, CLUSTAL W [7] tries to automatically recognize 7 input formats
for sequences (e.g., NBRF/PIR, EMBL/SWISSPROT, Pearson (Fasta), CLUSTAL).
The output of CLUSTAL W can be in one of five formats: CLUSTAL, GCG,
PHYLIP, NBRF/PIR, and GDE. Although the commonly used software packages
try to recognize inputs in several formats and try to provide outputs in formats
suitable for commonly used bioinformatics tools, most of the time it is not possible
to directly pass the output of one program as input to another program.
The typical output of a BLAST search cannot be passed directly as an input to
CLUSTAL W and the typical output of a CLUSTAL W run cannot be passed
without modification as input to PAUP or PHYLIP (unless the PHYLIP op- 
tion was chosen). CLUSTAL W does not provide an output option for PAUP. 
All this is unnecessarily inconvenient, restrictive and wasteful of computational 
resources as it implies that CLUSTAL W must be used more than once if dif- 
ferent Phylogenetic Inference tools (say PHYLIP and PAUP) were to be used 
on the sequence alignment produced by it although the same alignment will be 
produced each time. One could avoid reusing CLUSTAL W if one had a fast 
translator from the PHYLIP input format to PAUP input format (which is the 
Nexus format [10]). Similarly, files in PAUP format produced during previous ex- 
ecutions could be translated to PHYLIP format, to allow comparative inferences 
using different tools. This is illustrated in Figure 3.

Fig. 3. Avoiding Wasteful Computation via Translation



Additionally, the fast development of the Internet has prompted the develop- 
ment of markup languages adequate to describe classes of biological data — e.g., 
BioML [14] and BlastXML [19]. This introduces the additional need of inter- 
converting between these markup languages (currently used for data exchange 
purposes) and the more traditional formats recognized by existing applications. 
One of the goals of this project is to provide the technology to build such fast 
translators quickly and reliably and use it to build the translators themselves. 

The data format design for most bioinformatics tools seems ad hoc with at 
best an informal description being available for what is acceptable as valid input 
and what output may be produced. Often, these are illustrated only via examples 
or explained informally in English. The Nexus data description language perhaps 
is the most well-documented format, but, even for Nexus, a formal description 
is not available. A formal description of the valid inputs and outputs for the 
commonly used bioinformatics tools, possibly via formal grammars, will go a 
long way towards format standardization and ease translation between formats. 

3 Approach to Formats Interoperability 

To achieve our goal, the methodology that we use should be general enough so 
that the task of format translation/conversion is facilitated. This methodology 
should be extensible, in the sense that if a new format is introduced, then the 
amount of work required to develop conversion software w.r.t. the other formats 
is minimal. To accomplish both these objectives we propose to make use of a 
common intermediate data description language, general enough to subsume all 
popularly used existing formats for biological software systems (e.g., PHYLIP, 
PAML, CLUSTAL). The advantage of following this approach is that we just 
need to build two converters for every format, F, one for converting F to the 
common format and another one to convert the common format to F. 
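The benefit of a common intermediate format can be illustrated with a small converter registry: each format contributes one reader and one writer, and any pair of formats is then connected through the common representation. The sketch below is not our system; the two toy formats are drastic simplifications of FASTA and of sequential PHYLIP, and a dictionary from taxon names to sequences stands in for the common representation.

```python
def read_fasta(text):
    """Parse a simplified FASTA text into {name: sequence}."""
    seqs, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:]
            seqs[name] = ""
        elif name is not None:
            seqs[name] += line
    return seqs


def write_fasta(seqs):
    return "".join(f">{name}\n{seq}\n" for name, seq in seqs.items())


def read_phylip(text):
    """Parse a simplified sequential PHYLIP text into {name: sequence}."""
    lines = [line for line in text.splitlines() if line.strip()]
    ntax = int(lines[0].split()[0])
    seqs = {}
    for line in lines[1:1 + ntax]:
        name, seq = line.split()
        seqs[name] = seq
    return seqs


def write_phylip(seqs):
    nchar = len(next(iter(seqs.values()), ""))
    rows = [f"{len(seqs)} {nchar}"] + [f"{n} {s}" for n, s in seqs.items()]
    return "\n".join(rows) + "\n"


# One (reader, writer) pair per format: 2N converters instead of N*(N-1).
CONVERTERS = {
    "fasta": (read_fasta, write_fasta),
    "phylip": (read_phylip, write_phylip),
}


def convert(text, src, dst):
    reader, _ = CONVERTERS[src]
    _, writer = CONVERTERS[dst]
    return writer(reader(text))
```

Adding a new format to such a registry requires only its own reader and writer, whatever the number of formats already present.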

In the first stage of this work, we chose Nexus to be the common interme- 
diate language [10]. Nexus represents a good starting point since it has been 
designed as a data description language, independent of any specific operating 
environment or application, and general enough to encompass the most rele- 
vant data types used in bioinformatics applications. Additionally, it is provided 
with a semi-formal language description, intentionally oriented towards computer
processing. Nexus, in its current format, however, is still limited in that it
can represent biological information, but cannot represent format-related
information, such as that adopted in the outputs of some popularly used software
systems (e.g., CLUSTAL W). On the other hand, the design of Nexus envisions
the possibility of extending the language with additional features (e.g., applica- 
tion dependent components) to cover these missing capabilities. 

Construction of the Translators: Most of the existing applications which 
use the Nexus format rely on ad hoc translators, which are used to extract and 
translate small subsets of the input file. This approach is not scalable, in the 
sense that for each software package an ad hoc translator has to be developed. 
This results in a repeated effort, as the commonality among different translators 
is not exploited. General parsing of Nexus is a hard task. Two approaches can be
considered to tackle this problem: (i) Manually develop a parser from scratch.
This is a time-consuming task and may produce unreliable parsers. Resulting
parsers are always hard to verify. (ii) Use available tools to automatically
generate parsers from grammars (e.g., YACC). However, YACC can handle only
restricted types of languages (those for which an LALR(1) grammar is available).
Thus, developing a complete parser using YACC is still a non-trivial task;
there are various problems that one might encounter, including the fact that an
LALR(1) grammar for the given language may be hard or impossible to construct.
Of these approaches, the second one is considerably less complex, more
reliable and theoretically better grounded. However, the issues discussed above 
indicate that building a parser using YACC for a non-trivial language is still a 
difficult task, outside the domain of expertise of most biologists. Furthermore, 
while neither of these approaches can be generalized to accommodate additional 
translators or language extensions (they are fundamentally “one-off” activities), 
both require the same careful analysis of the structure of the language being 
translated as the more general approach we advocate below. 

Logic Programming Technology: In this project we propose to use logic 
programming (LP) [16], and more specifically its implementation in the Pro- 
log language, for the development of translation tools between Nexus and other 
data formats. This leads to solutions that are less time-consuming, more reli- 
able, flexible and extensible. We believe that one of the reasons for a paucity of 
translators and parsers for biological notations and languages is because of the 
complexity of the traditional translation technology (e.g., we could not find a 
complete public domain parser for Nexus on the web). On the other hand, the 
Definite Clause Grammar (DCG) facility of Prolog allows one to rapidly specify 
and implement parsers. In addition to the general high-level and declarative na- 
ture of the language, there are a number of specific features of Prolog that have 
led us to select it as development platform for this project. The use of DCGs,
as mentioned, allows us to combine the process of creating a formal description 
of Nexus with the process of deriving a working parser. In addition, Prolog’s 
non-determinism allows one to produce (from a DCG with no extra effort) a 
non-deterministic parser. Since the languages we are dealing with contain, from 
the parsing point of view, very complex features, the use of non-determinism 
allows us to obtain a working parser without the need of modifications to the 
grammar. Furthermore, since the non-terminals in a DCG parser are predicates,
it is possible to add arguments to the components of the grammar. This allows us to
create “communication channels” between the grammar elements, making it easy to
handle the context-sensitive components of Nexus. Finally, the use of
unification allows one to build “reversible” procedures, i.e., procedures whose 
arguments can be used either as input or as output depending on the needs. In 
the context of format translation, it is possible to develop a single procedure 
which translates between two formats and use it to perform translation in both 
directions. Although this is difficult to achieve in general, the reversibility has 
been used to avoid rebuilding parts of the translation process. 
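The role of such "communication channels" can be illustrated, outside Prolog, by a hand-written parser in which a value recognized by one grammar element is passed as an argument to another. The sketch below is not our DCG: it assumes an already tokenized input and a simplified form of two Nexus commands, where the number of taxa announced by dimensions constrains the number of labels accepted by taxlabels. All function names are hypothetical.

```python
def parse_dimensions(tokens, pos):
    """dimensions ntax = N ;   returns (N, next position)."""
    head = [t.lower() for t in tokens[pos:pos + 3]]
    if head != ["dimensions", "ntax", "="] or tokens[pos + 4:pos + 5] != [";"]:
        raise SyntaxError("malformed dimensions command")
    return int(tokens[pos + 3]), pos + 5


def parse_taxlabels(tokens, pos, ntax):
    """taxlabels L1 ... Lntax ;   `ntax` is the value passed from dimensions."""
    if tokens[pos:pos + 1] != ["taxlabels"]:
        raise SyntaxError("taxlabels command expected")
    end = pos + 1 + ntax
    labels = tokens[pos + 1:end]
    if len(labels) != ntax or ";" in labels or tokens[end:end + 1] != [";"]:
        raise SyntaxError(f"expected exactly {ntax} taxon labels")
    return labels, end + 1


def parse_taxa_commands(tokens):
    ntax, pos = parse_dimensions(tokens, 0)
    labels, pos = parse_taxlabels(tokens, pos, ntax)
    return {"ntax": ntax, "taxlabels": labels}


parsed = parse_taxa_commands("dimensions ntax = 3 ; taxlabels A B C ;".split())
```

In a DCG the same dependency is expressed by sharing a logical variable between the two non-terminals, without any explicit position bookkeeping.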

In order to systematically develop our framework, we have taken inspiration 
from the work on Horn Logic Denotational Semantics [4]. In this framework the 
denotational semantics of a formal language (domain specific or a traditional 
one) L is given in terms of Horn clause logic. Translators can be easily obtained
by providing the semantics of one format L1 in terms of another format L2,
i.e., L2 becomes the semantic algebra used to provide the semantics of L1. The
encoding of these semantics specifications in terms of logic programs allows one 
to automatically obtain a provably correct and working translator. 



4 System Implementation 

Description of Nexus: Nexus [10] is a data description language created to 
provide biologists practicing phylogenetic and systematic biology with a com- 
mon and application independent data format. The main principles behind the 
design of Nexus are [10]: (i) processibility: unlike other formats, the design of 
Nexus promotes (semi-)formal specification to facilitate computer processing of 
Nexus files; (ii) expandability: the structure of Nexus files is modular, allowing 
the addition of new types of data blocks without disrupting the rest of the file 
structure; (iii) inclusivity: the choice of basic data types provided in Nexus is
expected to cover all the needs of typical phylogenetic bioinformatics applications;
(iv) portability: while most of the existing formats have been designed to fit the
needs of a specific application, Nexus is an application-independent language;
its specification accounts also for operability within different computing envi- 
ronments. Nexus has gained considerable popularity in recent years. A number 
of applications (e.g., PAUP 3, MacClade, COMPONENT) have adopted Nexus 
as their data format. Nexus has also been adopted as a format for data housing 
in various public databases (e.g., TreeBase, Genetic Data Analysis). 

A Nexus file, taken from [6], is shown in Fig. 5. Any file must start with the
token #nexus and the information is organized in blocks delimited by the words
begin and end. The name of a Nexus block comes right after the begin word. For
instance, in Fig. 5 there are four blocks, namely data, codons, assumptions, and
paup. Each block is typed, i.e., it is used to describe a specific type of data (e.g.,
Taxa, Characters, Codons). In addition to these common blocks (called public
blocks in Nexus terminology), Nexus also allows private blocks, which are
meant to encapsulate additional program-dependent data, e.g., the MacClade
block is used to represent additional data components required by the MacClade
system [11]. Nesting blocks inside another Nexus block is not allowed. Each block
is composed of several commands and each command may have several options.
Each command is used to specify a certain property of the collection of data 
described by the block. In Figure 5 exset and charset are commands in the 
assumptions block. Similarly, the format command in the data block has two 
options, datatype and interleave. 
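The overall shape just described (the #nexus token, non-nested blocks, commands inside blocks) can be made concrete with a very small splitter. This is only a sketch under strong assumptions: it treats every semicolon as a command terminator and ignores comments and quoted tokens, which a real Nexus scanner cannot do. The function name is hypothetical.

```python
def split_blocks(text):
    """Return [(block_name, [command, ...]), ...] for a simplified Nexus text."""
    text = text.strip()
    if not text.lower().startswith("#nexus"):
        raise ValueError("a Nexus file must start with #NEXUS")
    commands = [c.strip() for c in text[len("#nexus"):].split(";") if c.strip()]
    blocks, current = [], None
    for command in commands:
        words = command.split()
        keyword = words[0].lower()
        if keyword == "begin":
            if current is not None:
                raise ValueError("blocks cannot be nested")
            if len(words) != 2:
                raise ValueError("begin must be followed by a block name")
            current = (words[1].lower(), [])
        elif keyword in ("end", "endblock"):
            if current is None:
                raise ValueError("end without a matching begin")
            blocks.append(current)
            current = None
        elif current is None:
            raise ValueError("command outside of a block")
        else:
            current[1].append(command)
    if current is not None:
        raise ValueError("unterminated block")
    return blocks


example = """#NEXUS
begin data;
  dimensions ntax=2 nchar=4;
  format datatype=dna interleave;
end;
begin paup;
  set a=b;
end;"""
blocks = split_blocks(example)
```

Here `blocks` holds a data block with its dimensions and format commands, followed by a private paup block.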

A formal description of Nexus has been attempted in [10]. In spite of the 
effort, the outcome is insufficient to be directly usable in the generation of a 
parser for Nexus. The description provided in [10] is not based on formal con- 
structions (e.g., grammars) but in terms of templates and examples. As a result, 
the specification is largely incomplete and contains a number of undefined or 
partially inconsistent situations. The lack of a formal language specification at 
the time of the design of Nexus has also led to a language containing features 
which are difficult to specify and even harder to parse. 

Even though the general structure of a Nexus file is well defined, its syntax
has several characteristics that make it different from the syntactic constructs
found in programming languages. These are discussed in the rest of this section. 
Moreover the Nexus standard is quite large. To meet the goals of this project 
we have developed a complete BNF grammar for the Nexus language. This has 
been accomplished by combining the semi-formal specification presented in [10] 
with information drawn from user manuals of applications which use Nexus (e.g., 
PAUP) and data files obtained from the Web and from local biologists. Our BNF 
grammar definition of the Nexus standard [10] is around four thousand lines long, 
and this will grow as the Nexus standard is extended by adding more blocks or
commands. Some of those possible extensions are already implemented in some 
new versions of MacClade [11]. Due to the size of the grammar, we show here 
just a small fragment of the BNF notation used to describe Nexus — Fig. 4 shows 
the part of the definition for the main Nexus structure. Note the similarity with 
the structure of a DCG. A complete description of the Nexus grammar can be 
found at www.cs.nmsu.edu/~epontell/nexus. 

In order to translate from Nexus to another format, our program follows the
traditional four phases (reading input, lexical analysis, parsing, and translation).
The system reads the input file into a list of character codes. This list is then
scanned, creating a list of tokens. After this, the list of tokens is parsed using
a Definite Clause Grammar (DCG) to create a parse tree which is suitable for
translating to other formats. We shall devote the remainder of this section to
describing these phases, emphasizing the implementation techniques adopted.

Scanning: The scanner of a Nexus file is quite unusual in many ways because
of the syntactic and semantic properties of the tokens. The Nexus format is
case-insensitive. Comments in Nexus are enclosed in square brackets [ ] and
can be nested. For instance, [[[valid]]] is a valid comment, while
[[[invalid]] is not. Unlike in most languages, comments should not be
discarded during scanning, for two reasons. First, a comment can be a
comment-command, that is, a comment that contains a command which may be
used by a specific program. Second, comments in a Nexus file usually contain
important scientific notes which are specific to the data, and without them the
data may lose some relevant information. It is worth mentioning that the format
of a comment-command is very general and particular to each program. Thus,
what for one program is a comment that can be discarded may for another
program be essential information needed to complete a genetic analysis.

    nexus             --> "#NEXUS" blocks .

    blocks            --> block blocks |

    block             --> "begin" block_declaration end

    end               --> "end"
                        | "endblock"

    block_declaration --> block_taxa
                        | block_characters
                        | block_unaligned
                        | block_distances
                        | block_data
                        | block_codons
                        | block_sets
                        | block_assumptions
                        | block_trees
                        | block_notes
                        | block_unknown

Fig. 4. Nexus BNF fragment corresponding to the main format structure.
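
The nesting rule for comments described above can be illustrated with a small
DCG. This is only a sketch under stated assumptions: it works on character lists
rather than on the character codes used by the scanner, and the names comment//1,
comment_body//1 and is_comment/1 are hypothetical, not taken from the system
described in this paper.

```prolog
:- use_module(library(lists)).

% comment(-Chars): one balanced comment; the brackets are kept in Chars
% because Nexus comments must not be thrown away by the scanner.
comment(Chars) -->
    ['['], comment_body(Body), [']'],
    { append(['['|Body], [']'], Chars) }.

% comment_body(-Chars): text up to the matching ']', which may itself
% contain further balanced comments.
comment_body([]) --> [].
comment_body(Chars) -->
    comment(Inner), comment_body(Rest),
    { append(Inner, Rest, Chars) }.
comment_body([C|Rest]) -->
    [C], { C \== '[', C \== ']' },
    comment_body(Rest).

% is_comment(+Atom): Atom is exactly one well-nested comment.
is_comment(Atom) :-
    atom_chars(Atom, Chars),
    phrase(comment(_), Chars), !.
```

With these definitions the goal is_comment('[[[valid]]]') succeeds, while
is_comment('[[[invalid]]') fails because one closing bracket is missing.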

A token in Nexus is a sequence of characters delimited by spaces or
punctuation. However, there are some exceptions: a token can include spaces and
punctuation if enclosed in quotation marks, and comments do not break tokens.
E.g.,

    drosophila[\b]' aspartic'[\i]' acid'[\p]        (1)

represents the single token 'drosophila aspartic acid', and the comments in
(1) are in fact comment-commands indicating that aspartic should be displayed
in bold, acid should be in italics, and the following tokens should be displayed
using plain letters. One more feature of Nexus is that an underscore in a
non-quoted token is interpreted as a blank space. Therefore,
DROSOPHILA_ASPARTIC_ACID and the token in (1) represent the same syntactic
object.

Another interesting property of the Nexus file format is that an
identifier can start with a number. The scanner should be able to distinguish such
identifiers from numbers (that can be either integers or reals). For example,
0.339e-4_to_1.23 should be recognized as a valid identifier. Finally, there is
no strict notion of reserved words in Nexus, and most tokens have different
meanings depending on the context. For example, begin is a token that may be
used to delimit the beginning of a new Nexus block, but inside a block begin
can be used as an identifier. Similarly, 2.34 in a data matrix represents either
the value of a taxon frequency, or the symbols 2.34 corresponding to a DNA
sequence. Besides, some single characters, like the newline, may or may not have
a significant syntactic meaning depending on the context or according to several
selected options in the file.
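
The distinction between numbers and identifiers that start with digits, and the
reading of underscores as blanks, can be sketched as follows. The number syntax
used here (optional sign, digits, optional fraction, optional exponent) is an
assumption made for illustration, and token_types/2 and display_name/2 are
hypothetical names; the point is only that a token may receive more than one
candidate type, leaving the final choice to the parser.

```prolog
:- use_module(library(apply)).

% nexus_number//0: [sign] digits [. digits] [(e|E) [sign] digits]
nexus_number --> sign, digits1, fraction, exponent.

sign --> ( ['-'] ; ['+'] ; [] ).

digits1 --> digit, digits.
digits  --> digit, digits.
digits  --> [].

digit --> [C], { char_code(C, X), X >= 0'0, X =< 0'9 }.

fraction --> ['.'], digits1.
fraction --> [].

exponent --> ( [e] ; ['E'] ), sign, digits1.
exponent --> [].

% token_types(+Token, -Types): every type that cannot be ruled out yet.
token_types(Token, Types) :-
    atom_chars(Token, Chars),
    findall(Type, possible_type(Chars, Type), Types).

possible_type(Chars, number) :- once(phrase(nexus_number, Chars)).
possible_type(_, word).             % any unquoted token may be a word

% display_name(+Unquoted, -Name): '_' in an unquoted token reads as a blank.
display_name(Unquoted, Name) :-
    atom_chars(Unquoted, Chars0),
    maplist(underscore_to_blank, Chars0, Chars),
    atom_chars(Name, Chars).

underscore_to_blank('_', ' ') :- !.
underscore_to_blank(C, C).
```

Here token_types('2.34', Types) yields both number and word, whereas
'0.339e-4_to_1.23' is typed as a word only, since the number grammar cannot
consume the whole token.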

    #NEXUS [! Primate mtDNA]
    begin data;
      dimensions ntax=12 nchar=898;
      format datatype=dna interleave gap=-;
      matrix
        Homo_sapiens      AAGCTTCACCGGCGCAGTCATTCTCATAATCGCCCACGGGCTTACATCCT
        Pan               AAGCTTCACCGGCGCAATTATCCTCATAATCGCCCACGGACTTACATCCT
        Gorilla           AAGCTTCACCGGCGCAGTTGTTCTTATAATTGCCCACGGACTTACATCAT
        Pongo             AAGCTTCACCGGCGCAACCACCCTCATGATTGCCCATGGACTCACATCCT
        Hylobates         AAGCTTTACAGGTGCAACCGTCCTCATAATCGCCCACGGACTAACCTCTT
        Macaca_fuscata    AAGCTTTTCCGGCGCAACCATCCTTATGATCGCTCACGGACTCACCTCTT
        M._mulatta        AAGCTTTTCTGGCGCAACCATCCTCATGATTGCTCACGGACTCACCTCTT
        M._fascicularis   AAGCTTCTCCGGCGCAACCACCCTTATAATCGCCCACGGGCTCACCTCTT
        M._sylvanus       AAGCTTCTCCGGTGCAACTATCCTTATAGTTGCCCATGGACTCACCTCTT
        Saimiri_sciureus  AAGCTTCACCGGCGCAATGATCCTAATAATCGCTCACGGGTTTACTTCGT
        Tarsius_syrichta  AAGTTTCATTGGAGCCACCACTCTTATAATTGCCCATGGCCTCACCTCCT
        Lemur_catta       AAGCTTCATAGGAGCAACCATTCTAATAATCGCACATGGCCTTACATCAT
    end;

    begin codons;
      codonposset * codons =
        N: 1 458-659 897 898,
        1: 2-455\3 660-894\3,
        2: 3-456\3 661-895\3,
        3: 4-457\3 662-896\3;
      codeset * codeset = mtDNA.mam.ext: all;
    end;

    begin assumptions;
      usertype ttbias (stepmatrix) = 4
             A  C  G  T
        [A]  .  6  1  6
        [C]  6  .  6  1
        [G]  1  6  .  6
        [T]  6  1  6  .;
      charset tRNA_His = 459-528;
      charset 'tRNA_Ser_(AGY)' = 529-588;
      charset 'tRNA_Leu_(CUN)' = 589-659;
      charset 1st_positions = 2-455\3 660-894\3;
      charset 2nd_positions = 3-456\3 661-895\3;
      charset 3rd_positions = 4-457\3 662-896\3;
      exset protein_only = 1 458-659 897 898;
      exset non_protein = 2-457 660-896;
    end;

    begin paup;
      [Standard ML benchmark]
      outgroup Lemur_catta Tarsius_syrichta;
      set criterion=likelihood;
      lset var=f84;
      hs;
    end;



Fig. 5. A Sample Nexus File 



Due to the characteristics of Nexus, only a few token recognition decisions
can be made during the scanning process, while most of them have to be delayed
to the parsing phase. Our approach is to recognize tokens in a general manner,
assigning in many cases more than one possible type to a recognized token.
The result of the scanning process is a list of syntactic elements, like the one in
Figure 6. The figure shows a formatted fragment of the list of syntactic elements
produced by the scanner taking the file in Figure 5 as input. The token item, i.e.,
the first item of each syntactic element, is in reality a list of character codes,
but here we present it as a single character string for the sake of readability.
Each syntactic element is in turn a list with four items [Token, Type, Line,
Character]. The item Token is a list of character codes representing the token.
The second item, Type, is a list of possible types for that token. The third and
fourth items represent the line/character coordinates of the first character of the
token in the file. The token coordinates are used during the parsing phase to
communicate possible warnings, errors, or other messages to the user.

    [['#NEXUS', [word, unquoted], 1, 1], ['\n', [new_line], 1, 8],
     ['[! Primate mtDNA]', [comment, comment_command], 2, 1], ['\n', [new_line], 2, 17],
     ['begin', [word, unquoted], 3, 1], ['data', [word, unquoted], 3, 7],
     [';', [symbol], 3, 11], ...]



Fig. 6. Result of Lexical Analysis 



Note that in Figure 6 newline characters and comments are not discarded,
for the reasons explained above. Also, some tokens may have more than one
recognized type. For instance, the token [! Primate mtDNA] has two recognized
types, comment and comment_command. During parsing and translation, the
expected type is matched against any of the possible types recognized in the
scanning phase. These activities have been greatly facilitated by the use of
Prolog, with its non-deterministic search capabilities and the flexible use of
logical variables. As an interesting example, consider again the token in (1).
Recall that comments should not be discarded during the scanning phase. The
corresponding syntactic element created for this token looks like:

    [[['Drosophila', [word, unquoted], 1, 1],
      ['[\b]', [comment, comment_command], 1, 11],
      [''' aspartic''', [word, quoted], 1, 15],
      ['[\i]', [comment, comment_command], 1, 26],
      [''' acid''', [word, quoted], 1, 30],
      ['[\p]', [comment, comment_command], 1, 37]], [word, comment_word], 1, 1]

In this case the type [word, comment_word] is used to clarify that this token
is a word composed of comments and words. The corresponding token item of
this syntactic element contains a list of syntactic elements which may have to be
combined in the parsing or the translation phase when this is required. Related to
the issue of handling comments, Nexus allows special types of comments to affect
the parsing process as well. For example, the comment-commands [&U]
and [&R] in a Tree block have to be recognized by the parser, since they are
used to communicate whether the described tree is unrooted or rooted.

Parsing: Once a list of syntactic elements is created in the scanning phase,
parsing is performed using Prolog’s DCGs. The creation of the basic parser is
relatively simple, as it can be obtained by translating the rules of the grammar
specification of Nexus (e.g., the rules in Fig. 4) to DCG rules in a straightforward
manner. However, there are several aspects that had to be taken into account
to create the Nexus parser. First, Nexus is a format that is widely used by
several programs, but it is also under continuing development. Some syntactic
elements in original Nexus blocks have been dropped and some others have been
added. Since many files were written using original Nexus formats, the current
standard [10] encourages support of every version of Nexus if possible. Further
enhancements are also planned in future Nexus releases.

In order to meet these requirements, Prolog has been very useful in terms of 
program organization and maintenance. The Nexus parser is in fact composed of 
several parser modules, one for each different Nexus block. Each parser is written 
with an independent DCG. In this manner recognizing a new Nexus block in the 
future would consist of just adding a new parser module with a DCG capable of 
parsing the new block. Moreover, future standard modifications to a given block 
should imply only modifications to the corresponding parsing module. 

A more interesting aspect of Nexus is that it allows the introduction of new
user-defined blocks, or even the introduction of new user-defined commands
inside a standard block. In this regard, the standard [10] suggests that during
parsing, a non-recognized structure should be skipped and a corresponding
warning message issued. However, a Nexus file should be rejected with an error
message when a grammatical structure seems to follow a standard pattern but at
some point a syntactic problem is found that makes it impossible to recognize
the entire file. To implement this kind of behavior, it is essential to rely on the
use of non-determinism, letting the parser try the possible alternatives without
commitment until a satisfactory one is found (if any).
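
The skip-or-reject behavior just described can be sketched with two alternative
DCG clauses. The sketch assumes a simplified input (a plain list of atoms
instead of syntactic elements) and models a single toy standard block; the names
standard_block/1, block//1, block_body//2, labels//1 and skipped//1 are
hypothetical and are not taken from the actual parser.

```prolog
% Names of blocks whose syntax is fixed by the standard (toy list).
standard_block(taxa).

block(Tree) --> [begin, Name, ';'], block_body(Name, Tree), [end, ';'].

% A standard block has to follow its grammar. If it does not, no clause
% applies, the parse fails, and the whole file is rejected.
block_body(taxa, taxa(Labels)) -->
    [taxlabels], labels(Labels), [';'].
% Any other block is accepted as unrecognized: its tokens are skipped and
% kept in an unknown/2 node, from which a warning can be produced later.
block_body(Name, unknown(Name, Skipped)) -->
    { \+ standard_block(Name) },
    skipped(Skipped).

labels([L|Ls]) --> [L], { L \== ';' }, labels(Ls).
labels([])     --> [].

skipped([T|Ts]) --> [T], { T \== end }, skipped(Ts).
skipped([])     --> [].
```

For instance, phrase(block(T), [begin, mystuff, ';', foo, 1, ';', end, ';'])
binds T to unknown(mystuff, [foo, 1, ';']), whereas a taxa block with a missing
semicolon makes the parse fail.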

Further complications arise from the presence of context-sensitive features
in the structure of the data descriptions. A typical example is the interaction
between the values of certain tokens acquired from the input and the
structure of the subsequent input. For example:

    Begin Data;
      DIMENSION NCHAR = 7;
      FORMAT DATATYPE = DNA MATCHCHAR = .;
      MATRIX
        taxon_1 GACCTTA
        taxon_2 ...C..A
        taxon_3 ..C.T..
    End;

The matching character (the dot in this example) indicates that the same
character in the same position as the previous taxon is to be used. This feature
of Nexus cannot be directly coded as a YACC rule (unless we use complex
functions to perform replacements of symbols). Additionally, the fact that one is
allowed to declare the number of characters in the sequences taxon_1, taxon_2,
and taxon_3 by using the command DIMENSION NCHAR = 7 makes the language
context-sensitive. With respect to these last requirements, Prolog has been a
valuable implementation tool. The embedded non-deterministic behavior of a DCG
allows us to guess the grammatical structure of the block or the command.
Additionally, if the parser is unable to find a standard pattern in the current
grammatical structure, such structure is accepted as unrecognized and a warning
message is sent to the user.
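
Both context-sensitive features can be expressed in a DCG by passing the
declared values as arguments. The following is a minimal sketch, assuming that
the matrix has already been reduced to a flat list of atoms (name, sequence,
name, sequence, ...) and that the match character refers to the previous row,
as stated above; matrix//3, rows//4, resolve/4 and resolve_char/4 are
hypothetical names.

```prolog
:- use_module(library(apply)).

% matrix(+NChar, +MatchChar, -Rows): Rows is a list of Name-Sequence pairs
% in which every match character has been replaced.
matrix(NChar, Match, Rows) --> rows(NChar, Match, none, Rows).

rows(_, _, _, []) --> [].
rows(NChar, Match, Prev, [Name-Seq|Rows]) -->
    [Name, Raw],
    { atom_chars(Raw, RawChars),
      length(RawChars, NChar),            % length fixed by the NCHAR value
      resolve(RawChars, Match, Prev, SeqChars),
      atom_chars(Seq, SeqChars) },
    rows(NChar, Match, SeqChars, Rows).

% resolve(+Chars, +Match, +Prev, -Seq): the first row is taken literally;
% in later rows the match character copies the previous row's character.
resolve(Chars, _, none, Chars) :- !.
resolve(Chars, Match, Prev, Seq) :-
    maplist(resolve_char(Match), Chars, Prev, Seq).

resolve_char(Match, Match, P, P) :- !.
resolve_char(_, C, _, C).
```

Applied to the three rows of the example with NChar = 7 and the dot as match
character, every taxon resolves to the sequence GACCTTA; with NChar = 8 the
same input is rejected.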

There is a final important implementation detail in the parser. A DCG
assumes that the input to parse is represented as a difference list of tokens.
Tokens in the difference list are matched against terminal symbols in a DCG rule
using unification. However, the input we give to a DCG is a list of syntactic
elements (as seen previously), and unification no longer works as a token
matching mechanism, due to the matching requirements described earlier.

Recall that a terminal symbol in a DCG is denoted as a list, that is, by enclosing
the terminal symbol in square brackets. For example, using a standard DCG,
the main Nexus DCG rule in the parser would look like:

    nexus --> ['#nexus'], blocks.        (1)

In our approach, the normal terminal symbol definition is replaced by a call to
match(?Token, ?Type, -Line, -Character, -Comments, +Begin, -End)
where Begin and End represent a difference list of syntactic elements. For
example, the DCG rule in (1) is written in the parser as:

    nexus(nexus(Blocks, Comments)) -->
        match('#nexus', Type, Line, Character, Comments), blocks(Blocks).

In this case match succeeds if Token matches with the token in the head 
of Begin, while Type contains the type of Token recognized by the scanner. 
Comments is the list of all comments preceding Token, and the coordinates of the 
token are stored in Line, Character. All this data is usually stored in the parse 
tree and used in the translation phase to facilitate the conversion process, report 
errors, or handle the comments according to the target format specification. 
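
A possible shape for such a matching predicate is sketched below as a DCG
nonterminal over syntactic elements of the form [Token, Types, Line, Character].
This is an illustration under stated assumptions, not the actual match routine:
tokens are atoms rather than code lists, only plain comments are collected, the
expected Token must be given in lower case, and downcase_atom/2 is the
SWI-Prolog built-in. The helper names leading_comments//1 and header//2 are
hypothetical.

```prolog
:- use_module(library(lists)).

% match(?Token, ?Type, -Line, -Char, -Comments)//
match(Token, Type, Line, Char, Comments) -->
    leading_comments(Comments),
    [[Found, Types, Line, Char]],
    { downcase_atom(Found, Token),     % Nexus is case-insensitive
      member(Type, Types) }.           % any of the candidate types will do

% leading_comments(-Tokens): comment elements in front of the next token.
leading_comments([C|Cs]) -->
    [[C, Types, _, _]], { memberchk(comment, Types) }, !,
    leading_comments(Cs).
leading_comments([]) --> [].

% Example use: recognize the file header and report where it was found.
header(Comments, Line-Char) -->
    match('#nexus', word, Line, Char, Comments).
```

Because the DCG translation adds the two list arguments, match//5 corresponds to
a seven-argument predicate, in line with the mode declaration given above.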

Translation: A parse tree suitable for translation is the result of the parsing
process. In the translation phase, the parse tree is interpreted and the result of
that interpretation is written in the corresponding target file format. The system
is intended to be able to translate from Nexus to several other file formats, so,
as in the parser, the approach we follow is modular in the sense that a different
Prolog module is provided for each target format supported. Therefore, adding
support for a new target format should only require adding a new corresponding
Prolog module. As stated before, we expect that this modularization strategy
will simplify both future development and maintenance. The translator module
is implemented according to the strategy presented in [5,4]. The parse tree is
interpreted in a top-down fashion using DCG rules with a format similar to the
DCG rules used in the parser. As a simple example, the translation rule looks
like the following:

    ph_nexus(nexus(Blocks), Store) :- read(Store, output_file, FileName),
        open(FileName, write, Handle), write(Store, output_handle, Handle),
        ph_blocks(Blocks, Store), close(Handle).

Note the one-to-one correspondence between this translation rule and the 
parsing rule described in the previous section. The ph_ prefix is used to denote 
that this rule is a PHYLIP translation rule. Similar prefixes are planned for each 
different translator module to be supported in the future. 

The variable Store is used to keep the computation state during the
interpretation. The predicates read(+Store, +Location, -Value) and
write(+Store, +Location, +Value) are used, respectively, to get the value of some
location in the store and to write a value to a location in the store. The internal
format of a store is very simple: a Store is just a list of pairs [Location, Value].
Additional side-effect predicates, like open and close in the previous example, are
inserted in the translation rules as necessary to accomplish the final translation.
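
A store of this kind can be sketched in a few lines. The version below is an
assumption made for illustration: it is purely functional, so writing returns a
new store through an extra argument, and the predicates are named store_read/3
and store_write/4 (hypothetical names) to avoid clashing with the built-in read
and write predicates.

```prolog
:- use_module(library(lists)).
:- use_module(library(apply)).

% store_read(+Store, +Location, -Value): value currently held at Location.
store_read(Store, Location, Value) :-
    memberchk([Location, Value], Store).

% store_write(+Store0, +Location, +Value, -Store): Store is Store0 with
% Location set to Value; any older pair for Location is dropped.
store_write(Store0, Location, Value, [[Location, Value]|Store1]) :-
    exclude(has_location(Location), Store0, Store1).

has_location(Location, [Location, _]).
```

For example, writing output_file twice leaves a single pair in the store, and a
subsequent store_read/3 returns the most recent value.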

5 Discussion and Related Work 

All the components in our system are currently implemented in Prolog, which
brings both advantages and challenges. It is well known that LR(k)
parsers have important advantages over other kinds of parsers, like fast parsing
and reduced use of stack memory. Moreover, LALR(1) parser generators and
optimized scanner generators are already available for other programming
languages, e.g., YACC and LEX for the C language. On the other hand, the
declarative nature of Prolog usually allows faster development, more compact
code, and more confidence about the correctness of the program. Further,
the inherent nondeterminism of a DCG is particularly useful for languages where
ambiguity or exceptional events may arise during parsing, as in Nexus.

In our case the most time-consuming part of creating a Nexus parser was the
creation of a grammar for Nexus. The description given in the standard [10] is not
formal at all; that may be one reason why there is no other fully compliant
Nexus parser (to the best of our knowledge). Our original BNF description of
Nexus was intentionally written in a syntactic form very close to a DCG.
Therefore, once we constructed the BNF we essentially had the parser completed.
The only modifications to the code with respect to the BNF description are related
to the special match routine and the extra argument required to generate the
parse tree. The amount of code in the final parser is close to the size of the
BNF description. More important is the fact that we were quite certain about
the behavior of the parser, since the BNF description was carefully reviewed.
Furthermore, the processing of Nexus code requires the use of a number of data
structures (e.g., lists, trees) which are readily available in Prolog.

With respect to the scanner generator issue, it is necessary to mention
that there also exist publicly available scanner generators for Prolog, e.g.,
Elex(Prolog) and PLEX [15,18]. These scanner generators may not be as fast as
a scanner generated by LEX, but they at least relieve the programmer of the
task of writing and maintaining the scanner by hand. In particular, Elex(Prolog)
is quite flexible and has acceptable performance, especially in cases where the
finite automata can be defined in a deterministic way, so that the internally
implemented look-ahead feature in Elex(Prolog) is not necessary. In this regard we
should remark that the LP community would benefit from the development of an
optimized scanner generator for Prolog, comparable to LEX, especially knowing
that compilers and natural language processing systems are fertile areas for
Prolog.







Some other features in Prolog provide pleasant development and run-time
environments. Different parsers and translators specific to some parts of the
whole grammar can be kept in separate modules, facilitating future development
and maintenance. Additionally, parsing and translation modules can be added
on the fly at run time by just loading the corresponding module.

Finally, as a note in favor of Prolog, we should say that given the computing 
power of today’s computers, it is unlikely that the final program can run out of 
memory or run with an unacceptable time performance. We have been able to 
parse Nexus files as large as 600KB in about 15 seconds on a standard PC. 

As mentioned earlier, very limited effort has been invested in the development 
of general tools for handling Nexus files. Most of the applications which are 
making use of the Nexus format limit their attention to recognizing only specific 
parts of Nexus files, thus not providing complete parsers for the language. 

The work that comes closest to ours is represented by NCL [9], which is a
C++ class library for reading Nexus files. The library deals properly with the
syntax of Nexus (including the various uses of comments), but does not have
a complete coverage of the language (e.g., there are limitations in recognizing
CHARACTERS blocks). The use of C++ libraries offers the potential of
extending NCL to recognize new blocks (e.g., private blocks), by deriving new
subclasses. On the other hand, the support for the creation of these new classes
is limited (only in terms of access to the lexical analyzer). This clearly does not
match the simplicity of extending our system, since the grammar is explicitly
present (as a DCG) and the structure considerably more modular.

A Windows-based tool called NDE [12] has also been developed to create 
and edit Nexus files using a spreadsheet-type interface. The tool has the ability 
to attach user annotations (even in the form of pictures) to components of the 
file, but has limited parsing capabilities; being developed mostly as an editor to 
manually create new Nexus files, NDE places various limitations on the kind of 
Nexus files it is capable of reading. Furthermore, NDE does not provide any way 
of using its parsing capabilities for format interconversion. 

The use of LP as a technology to enhance interoperability between applica- 
tions has recently gained relevance in a number of domains. Several examples 
have been developed in the context of Web formats [13] and in the conversion 
between formats for mathematics [8]. 



6 Conclusion and Future Work 

In this paper we have described our work in constructing a system for recogniz- 
ing, parsing and translating the Nexus data representation language. Nexus is a 
universal language for the representation of data for phylogenetic and systematic 
biology applications. The task has not been easy. Nexus is a rather informally 
specified language, and, in spite of the effort of its designers, it contains a number 
of features that are hard to describe and even harder to parse. 

Our task has been accomplished by first constructing a complete grammar
for Nexus. The grammar has allowed us to easily derive a parser for Nexus, by
directly translating the grammar rules into Prolog’s DCG rules. The definition
of the language contains a rather complex definition of tokens and a number of
context-sensitive features. Prolog’s ability to perform non-deterministic parsing,
and the ability to use DCGs to perform context-sensitive parsing, allowed us to
overcome these difficulties without much effort. We have also used our system to
produce the first of a series of inter-conversion programs, allowing translation
between Nexus and other commonly used formats. Indeed, the overall objective of
our work is to use Nexus as an intermediate language to provide interoperability
between different data formats. At this time we have successfully experimented
with the inter-conversion between Nexus and the PHYLIP format.

The results described in this paper represent the first step of an ambitious
project, called ΦLog. The goal of this project is to develop a declarative,
domain-specific language, designed to provide biologists with a convenient tool
for designing phylogenetic applications (without the need of any specialized
programming background). In this first phase we have focused on the use of
Nexus as a common data description language; in this perspective Nexus is an
excellent domain-specific data description language. What is missing in Nexus is
the ability to describe computations. Our objective is to extend Nexus by
providing it with a number of primitive operations corresponding to the most
basic operations required during phylogenetic inference processing, and a number
of declarative combinators to combine primitive operations and describe complete
inference processes. This is akin to allowing high-level descriptions of pipelines
such as the one depicted in Figure 1. The beauty of this approach is that the
mapping between the high-level steps of the inference process and the
corresponding applications in charge of executing such steps (e.g., CLUSTAL W
for sequence alignments, PAUP for tree inference, etc.), as well as all the
interoperability issues (e.g., conversions between input/output formats), will be
completely transparent to the domain-specific language programmer (e.g., a
biologist).

Abstract. The goal of this paper is to test if a programming methodology
based on the declarative language A-Prolog, and the systems for
computing answer sets of such programs, can be successfully applied to
the development of medium size knowledge-intensive applications. We
report on a successful design and development of such a system controlling
some of the functions of the Space Shuttle.

Keywords: answer set programming, logic programming, planning. 



1 Introduction 

The research presented in this paper is rooted in recent developments in several 
areas of AI. Advances in the work on semantics of negation in logic programming 
[12, 13] and on formalization of common-sense reasoning [25, 23] led to the 
development of the declarative language A-Prolog, used in this paper to encode 
the domain knowledge, and to an A-Prolog based methodology for representing 
defaults. Insights on the nature of causality and its relationship with answer 
sets of logic programs [14, 21, 26] determined the way we characterize effects of 
actions and solve the frame, ramification, and qualification problems which, for 
a long time, caused difficulties in modeling reasoning about dynamic domains. 
Work on propositional satisfiability influenced the development of algorithms for 
computing answer sets of A-Prolog programs and programming systems [24, 6, 5] 
implementing these algorithms. Last, but not least, we build on earlier work on 
applications of answer set programming to planning [8, 20]. 

The goal of this paper is to test if these methodologies, algorithms, and systems
can be successfully applied to the development of medium size knowledge-intensive
applications. We build on previous work [27, 4, 16] in which the authors developed
a prototype of a system, M0, capable of checking correctness of plans and finding
plans for the operation of the Reaction Control System (RCS) of the Space
Shuttle. The RCS is the shuttle’s system that has primary responsibility for
maneuvering the aircraft while it is in space. It consists of fuel and oxidizer tanks,
valves and other plumbing needed to provide propellant to the maneuvering jets
of the shuttle. It also includes electronic circuitry, both to control the valves in
the fuel lines and to prepare the jets to receive firing commands.

The RCS is computer controlled during takeoff and landing. While in orbit, how- 
ever, astronauts have the primary control. When an orbital maneuver is required, 
the astronauts must perform whatever actions are necessary to prepare the RCS. 
These actions generally require flipping switches, which are used to open or close 
valves or to energize the proper circuitry. In more extreme circumstances, such 
as a faulty switch, the astronauts communicate the problem to the ground flight 
controllers, who will come up with a sequence of computer commands to perform 
the desired task and will instruct the shuttle’s computer to execute them. 

During normal shuttle operations, there are pre-scripted plans that tell the as- 
tronauts what should be done to achieve certain goals. The situation changes 
when there are failures in the system. The number of possible sets of failures is 
too large to pre-plan for all of them. Continued correct operation of the RCS in 
such circumstances is necessary to allow for the completion of the mission and to 
help ensure the safety of the crew. An intelligent system to verify and generate 
plans would be helpful. It is within this context that this work fits. 

The system presented here, as well as the previous M0, can be viewed as a part
of a decision support system for shuttle flight controllers.

In this work we expand M0 to produce a substantially more detailed model of
the RCS. In particular, we

1. substantially simplify the model of the part of the RCS represented by M0
without loss of detail,

2. include information about electrical circuits of the RCS, which was missing
in M0,

3. include a new type of action - computer commands controlling the position 
of valves, 

4. include a planning module(s) containing a large amount of heuristic infor- 
mation (this substantially improves quality of the plans and efficiency of the 
search), 

5. include a Java interface to simplify the use of the system by a flight controller 
and by the system designers. 

The resulting system, M, seems to be suitable for practical applications. The 
work on its deployment at United Space Alliance is scheduled to start in Decem- 
ber of the year 2000. 

To understand the functionality of M let us imagine a shuttle flight controller
who is considering how to prepare the shuttle for a maneuver when faced with
a collection of faults present in the RCS (for example, switches and valves can
be stuck in various positions, electrical circuits can malfunction in various ways,
valves can be leaking, jets can be damaged, etc.). In this situation, the controller
needs to find a sequence of actions (a plan) to get the shuttle ready for the
maneuver. M can serve as a tool facilitating this task. The controller can use it
to test if a plan, which he came up with manually, will actually be able to prepare
the RCS for the desired maneuver. The system can also be used to automatically
find such a plan. In the next section we give a brief introduction to the design
of the system.

2 System’s Design 

The system, M, consists of a collection of largely independent modules,
represented by lp-functions¹, and a graphical Java interface, J. The interface gives
a simple way for the user to enter information about the history of the RCS,
its faults, and the task to be performed. At the moment there are two possible
types of tasks: checking if a sequence of occurrences of actions in the history of
the system satisfies a goal, G, and finding a plan for G of a length not exceeding
some number of steps, N. Based on this information, J verifies if the input is
complete, selects an appropriate combination of modules, assembles them into
an A-Prolog program, Π, and passes Π as an input to a reasoning system for
computing stable models. (In M this role is currently played by SMODELS²;
however, we also plan to investigate the performance of other systems.) In this
approach the task of checking a plan P is reduced to checking if there exists a
model of the program Π ∪ P. A planning module is used to generate a set of
possible plans the user is interested in, and the correctness theorem guarantees
that there is a one-to-one correspondence between the plans and the set of stable
models of the program. Planning is reduced to finding such models. Finally, the
Java interface extracts the appropriate answer from the SMODELS output and
displays it in a user-friendly format.

In the rest of this section we give a slightly more detailed description of particular 
modules. 



2.1 Plumbing Module 

The Plumbing Module {PM) models the plumbing system of the RCS, which 
consists of a collection of tanks, jets and pipe junctions connected through pipes. 
The flow of fluids through the pipes is controlled by valves. The system’s pur- 
pose is to deliver fuel and oxidizer from tanks to the jets needed to perform 
a maneuver. The structure of the plumbing system is described by a directed 
graph, Gr, whose nodes are tanks, jets and pipe junctions, and whose arcs are
labeled by valves. The possible faults of the system at this level are leaky valves,
damaged jets, and valves stuck in some position.

^ By an lp-function we mean a program Π of A-Prolog with input and output signatures
σ_i(Π) and σ_o(Π) and a set dom(Π) of sets of literals from σ_i(Π) such that, for any
X ∈ dom(Π), Π ∪ X is consistent, i.e. has an answer set.

^ http://www.tcs.hut.fi/Software/smodels

The purpose of PM is to describe how faults and changes in the position of 
valves affect the pressure of tanks, jets and junctions. In particular, when fuel 
and oxidizer flow at the right pressure from the tanks to a properly working jet, 
the jet is considered ready to fire. In order for a maneuver to be started, all the 
jets it requires must be ready to fire. The necessary condition for a fluid to flow 
from a tank to a jet, and in general to any node of Gr, is that there exists a 
path without leaks from the tank to the node and that all valves along the path 
are open. 

The rules of PM define a function which takes as input the structural description, 
Gr, of the plumbing system, its current state, including position of valves and the 
list of faulty components, and determines: the distribution of pressure through 
the nodes of Gr; which jets are ready to fire; which maneuvers are ready to be 
performed. 

To illustrate the issues involved in the construction of PM, let us consider the
definition of fluent pressurized_by(N,Tk), describing the pressure obtained on
a node N by a tank Tk. Some special nodes, the helium tanks, are always
pressurized. For all other nodes, the definition is recursive. It says that any node
N1 is pressurized by a tank Tk if N1 is not leaking and is connected by an open
valve to a node N2 which is pressurized by Tk.

Representation of this definition in standard Prolog is problematic, since the 
corresponding graph can contain cycles. (This fact is partially responsible 
for the relative complexity of this module in M0.) The ability of A-Prolog
to express and to reason with recursion allows us to use the following (slightly 
simplified) concise definition of pressure on non-tank nodes. 

h(pressurized_by(N1,Tk),T) :-
    not tank_of(N1,R),
    not h(leaking(N1),T),
    link(N2,N1,V),
    h(in_state(V,open),T),
    h(pressurized_by(N2,Tk),T).



The Plumbing Module consists of approximately 40 rules. 
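
The recursive definition of pressurized_by can be read as a least-fixpoint computation. The following Python sketch is an illustration only and is not part of M: the function name, the argument layout and the toy graph are assumptions, and helium tanks are assumed to pressurize themselves as the base case. It shows that the iteration terminates on a cyclic graph and that the nodes of a cycle do not pressurize one another without a source.

```python
def pressurized_by(links, open_valves, leaking, tanks):
    """Return the set of (node, tank) pairs derivable by the recursive rule.

    links: iterable of (n2, n1, valve) arcs; fluid may flow from n2 to n1.
    open_valves: set of valves that are currently open.
    leaking: set of leaking nodes.
    tanks: set of always-pressurized tanks (base case, assumed here).
    """
    facts = {(tank, tank) for tank in tanks}
    changed = True
    while changed:  # repeat until no new fact can be derived
        changed = False
        for n2, n1, valve in links:
            # Body conditions: N1 is not a tank, N1 is not leaking,
            # and the valve on the arc N2 -> N1 is open.
            if n1 in tanks or n1 in leaking or valve not in open_valves:
                continue
            for node, tank in list(facts):
                if node == n2 and (n1, tank) not in facts:
                    facts.add((n1, tank))
                    changed = True
    return facts


# Toy graph with a cycle between junctions j1 and j2 (hypothetical names).
links = [("he", "j1", "v1"), ("j1", "j2", "v2"),
         ("j2", "j1", "v3"), ("j2", "jet", "v4")]
with_source = pressurized_by(links, {"v1", "v2", "v3"}, set(), {"he"})
without_source = pressurized_by(links, {"v2", "v3"}, set(), {"he"})
```

With v1 closed, j1 and j2 remain unpressurized even though they feed each other, which is the behavior expected of the answer set semantics for this rule when the negated conditions are fixed.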



2.2 Valve Control Module 

The flow of fuel and oxidizer propellants from tanks to jets is controlled by open- 
ing/closing valves along the path. The state of valves can be changed either by 
manipulating mechanical switches or by issuing computer commands. Switches 
and computer commands are connected to the valves they control by electrical
circuits.

The action of flipping a switch Sw to some position S normally puts a valve
controlled by Sw in this position. Similarly for computer commands. There are,
however, three types of possible failures: switches and valves can be stuck in some
position, and electrical circuits can malfunction in various ways. Substantial
simplification of the VCM module is achieved by dividing it into two parts, called
basic and extended VCM modules.

At the basic level, it is assumed that all electrical circuits are working prop- 
erly and therefore are not included in the representation. The extended level 
includes information about electrical circuits and is normally used when some 
of the circuits are malfunctioning. In that case, flipping switches and issuing 
computer commands may produce results that cannot be predicted by the basic 
representation. 



Basic Valve Control Module At this level, the VCM deals with a set of 
switches, computer commands and valves, and connections among them. The 
input of the basic VCM consists of the initial positions and faults of switches and
valves, and the sequence of actions defining the history of events. The module
implements an lp-function that, given this input, returns positions of valves
at the current moment of time. This output is used as input to the plumbing
module. The possible faults of the system at this level are valves and switches
stuck at some position(s).

Effects of actions in the basic VCM are described in a variant of action language
B [15], which contains both static and dynamic causal laws, as well as impossibil-
ity conditions. Our version of B uses a slightly different syntax to avoid lists and
nesting of function symbols, because of limitations of the inference engines cur-
rently available. The use of B allows us to prove correctness of the logic programming
implementation of causal laws [11]. (Of course, it does not guarantee correctness
of the causal laws per se. This can only be done by domain experts.) The com-
plexity of this representation makes it hard to employ STRIPS-like formalisms.

The following rules show an example of syntax and use of our version of B. The 
first is a dynamic causal rule stating that, if a properly working switch Sw is 
flipped to state S at time T, then Sw will be in this state at the next moment
of time. 

h(in_state(Sw,S),T+1) :-
    occurs(flip(Sw,S),T),
    not stuck(Sw).

A static connection between switches and valves is expressed by the next rule. 
This static law says that, under normal conditions, if switch Sw controlling a 
valve V is in some state S (different from gpc^) at time T, then V is also in this 
state at the same time. 

h(in_state(V,S),T) :-
    controls(Sw,V),
    h(in_state(Sw,S),T),
    neq(S,gpc),
    not h(ab_input(V),T),
    not stuck(V),
    not bad_circuitry(V).

^ A switch can be in one of three positions: open, closed, or gpc. When it is in gpc, it
does not affect the state of the valve.



The condition not bad_circuitry(V) is used to stop this rule from being applied
when the circuit connecting Sw and V is not working properly. (Notice that
the previous dynamic rule, instead, is applied independently of the functioning
conditions of the circuit, since it is related only to the switch itself.) If the switch
is in a position S1, different from gpc, and a computer command is issued to
move the valve to position S2, then there is a conflict in case S1 ≠ S2. This is an
abnormal situation, which is expressed by fluent ab_input(V). When this fluent
is true, negation as failure is used to stop the application of this rule. In fact, the 
final position of the valve can only be determined by using the representation of 
the electrical circuit that controls it. This will be discussed in the next section. 
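
To make the interplay of the two laws concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the function names and dictionary layout are hypothetical, and switches that are not flipped are assumed to keep their position (the inertia rule is not shown in this excerpt).

```python
def step_switches(switch_state, flips, stuck):
    """Dynamic law: a flipped switch that is not stuck takes the new position.

    Unflipped switches keep their position (inertia is assumed here).
    """
    new_state = dict(switch_state)
    for switch, position in flips:
        if switch not in stuck:
            new_state[switch] = position
    return new_state


def valve_positions(switch_state, controls, stuck_valves=frozenset(),
                    ab_input=frozenset(), bad_circuitry=frozenset()):
    """Static law: a valve follows its switch at the same time step, unless
    the switch is in gpc or one of the exceptional conditions holds."""
    derived = {}
    for switch, valve in controls:
        position = switch_state[switch]
        if position == "gpc":
            continue  # a switch in gpc does not affect the valve
        if valve in stuck_valves or valve in ab_input or valve in bad_circuitry:
            continue  # the rule is blocked; other rules must decide
        derived[valve] = position
    return derived


# Hypothetical example: sw2 is stuck and the circuit of v2 is faulty.
before = {"sw1": "closed", "sw2": "closed"}
after = step_switches(before, [("sw1", "open"), ("sw2", "open")], stuck={"sw2"})
valves = valve_positions(after, [("sw1", "v1"), ("sw2", "v2")],
                         bad_circuitry={"v2"})
```

Valves for which the static law is blocked are simply absent from the result, mirroring the fact that their position has to be derived from the circuit representation.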



Extended Valve Control Module The extended VCM encompasses the 
basic VCM and also includes information about electrical circuits, power and 
control buses, and the wiring connections among all the components of the sys- 
tem. 

The Ip-function defined by this module takes as input the same information 
accepted by the basic VCM, together with faults on power buses, control buses 
and electrical circuits. It returns the positions of valves at the current moment 
of time, exactly like the basic VCM. 

Since (possibly malfunctioning) electrical circuits are part of the representation, 
it is necessary to compute the signals present on all wiring connections, in order 
to determine the positions of valves. The signals present on the circuit’s wires 
are generated by the Circuit Theory Module (CTM), included in the extended 
VCM. Since this module was developed independently to address a different 
collection of tasks [2, 3], its use in this system is described in a separate section. 

There are two main types of valves in the RCS: solenoid and motor controlled 
valves. Depending on the number of input wires they have, motor controlled
valves are further divided into three sub-types. While at the basic VCM level there
is no need to distinguish between these different types of valves, they must be taken
into account at the extended level, since the type determines the number of input 
wires of the valve. In all cases, the state of a valve is normally determined by 
the signals present on its input wires. 

For the solenoid valve, its two input wires are labeled open and closed. If the 
open wire is set to 1 and the closed wire is set to 0, the valve moves to state open.
Similarly for the state closed. The following static law defines this behavior. 
h(in_state(V,S1),T) :-
    input(W1,V),
    input(W2,V),
    input_of_type(W1,S1),
    input_of_type(W2,S2),
    h(value(W1,1),T),
    h(value(W2,0),T),
    neq(S1,S2),
    not stuck(V).



The state of all other types of valves is determined in much the same way. The 
only difference is in the number of wires that are taken into consideration. 
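
The solenoid law can be paraphrased procedurally as follows. This Python sketch is only an illustration: the function name and the use of None for "this rule does not fire" are assumptions, and wire values are taken to be 0, 1 or "u".

```python
def solenoid_state(wire_values, stuck=False):
    """Return the state forced by the input wires, or None if the law is silent.

    wire_values maps the wire labels "open" and "closed" to 0, 1 or "u".
    """
    if stuck:
        return None  # a stuck valve ignores its input wires
    for label, other in (("open", "closed"), ("closed", "open")):
        # The wire labeled with the target state carries 1, the other carries 0.
        if wire_values.get(label) == 1 and wire_values.get(other) == 0:
            return label
    return None


state = solenoid_state({"open": 1, "closed": 0})
```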

The output signals of switches, valves, power buses and control buses are also 
defined by means of static causal laws. 

At this level, the representation of a switch is extended by a collection of input 
and output wires. Each input wire is associated with one and only one output wire,
and every input/output pair is linked to a position of the switch. When a switch
is in position S, an electrical connection is established between input Wi and
output Wo of the pair(s) corresponding to S. Therefore, the signal present on
Wi is transferred to Wo, as expressed by the following rule.

h(value(Wo,X),T) :-
    h(in_state(Sw,S),T),
    connects(S,Sw,Wi,Wo),
    h(value(Wi,X),T).



The VCM consists of 36 rules, not including the rules of the Circuit Theory 
Module. 



2.3 Circuit Theory Module 

The Circuit Theory Module (CTM) is a general description of components of 
electrical circuits. It can be used as a stand-alone application for simulation, 
computation of the topological delay of a circuit, detection of glitches, and ab- 
duction of the circuit’s inputs given the desired output. 

The CTM is employed in this system to model the electrical circuits of the RCS, 
which are formed by digital gates and other electrical components, connected by 
wires. Here, we refer to both types of components as gates. The structure of an 
electrical circuit is represented by a directed graph E where gates are nodes and 
wires are arcs. A gate can possibly have a propagation delay D associated with 
it, where D is a natural number (zero indicates no delay). All signals present in
the circuit are expressed in 3-valued logic (0, 1, u). If no value is present on a
wire at a certain moment of time T then it is said to be unknown (u) at T. 




This module describes the normal and faulty behavior of electrical circuits with 
possible propagation delays and 3-valued logic.

In CTM, input wires of a circuit are defined as the wires coming from switches, 
valves, computer commands, power buses and control buses. Output wires are 
those that go to valves. The CTM is an lp-function that takes as input the
description of a circuit C, the values of signals present on its input wires, the set 
of faults affecting its gates, and determines the values on the output wires of C 
at the current moment of time. 

We allow for standard faults from the theory of digital circuits [19, 7]. A gate 
G malfunctions if its output, or at least one of its input pins, is permanently
stuck on a signal value. The effect of a fault associated with a gate of the directed
graph E only propagates forward.

CTM contains two sets of static rules. One of them allows for the representation 
of the normal behavior of gates, while the other expresses their faulty behavior. 
To illustrate how the normal behavior of gates is described in the CTM, let us 
consider the case of the Tri-State gate. This type of component has two input 
wires, of which one is labeled enable. If this wire is set to 1, the value of the other 
input is transferred to the output wire. Otherwise, the output is undefined. The 
following rule describes the normal behavior of the Tri-State gate when it is 
enabled. 

h(value(W,X),T+D) :-
    delay(G,D),
    input(W1,G),
    input(W2,G),
    type_of_wire(W2,G,enable),
    neq(W1,W2),
    h(value(W1,X),T),
    h(value(W2,1),T),
    output(W,G),
    not is_stuck(W,G).



It is interesting to discuss how faults are treated when they occur on the in- 
put wire of a gate. Let us consider the case of a gate G with an input wire 
stuck at value X. This wire is represented as two unconnected wires, W and
stuck_wire(W), corresponding to the normal and faulty sections of the wire.
The faulty part is stuck at value X, while the value of W is computed by nor-
mal rules depending upon its connection to the output of other gates. Rules for
gates with faulty inputs use stuck_wire(W) as input wire. The example below
is related to a Tri-State gate with the non-enable wire stuck at X.

h(value(W,X),T+D) :-
    delay(G,D),
    input(stuck_wire(W1),G),
    input(W2,G),
    type_of_wire(W2,G,enable),
    neq(W1,W2),
    h(value(stuck_wire(W1),X),T),
    h(value(W2,1),T),
    output(W,G),
    not is_stuck(W,G).



Notice that condition not is_stuck(W,G) prevents the above rules from being
applied when the output wire is stuck. Whenever an output wire is stuck at X, 
the corresponding rule guarantees that its signal value is always X. 
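
The normal and faulty behavior of the Tri-State gate can be summarized in a few lines of Python. The sketch below is a hedged illustration, not the CTM itself: the function name and keyword arguments are hypothetical, and the "undefined" output of a disabled gate is assumed to be the unknown value u.

```python
UNKNOWN = "u"  # third truth value of the 3-valued logic


def tri_state(t, delay, data, enable, stuck_data=None, stuck_output=None):
    """Return (time, value) of the output caused by inputs read at time t.

    stuck_data: value of the faulty section of the data wire, if any.
    stuck_output: value the output wire is stuck at, if any.
    """
    if stuck_output is not None:
        # A stuck output overrides the gate rules entirely.
        value = stuck_output
    else:
        if stuck_data is not None:
            data = stuck_data  # the gate reads stuck_wire(W1) instead of W1
        # Enabled: pass the data through. Otherwise: unknown (assumption).
        value = data if enable == 1 else UNKNOWN
    return t + delay, value


normal = tri_state(3, 2, 0, 1)
faulty_input = tri_state(0, 0, 1, 1, stuck_data=0)
```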

The behavior of a circuit is said to be normal if all its gates are functioning correctly.
If one or more gates of a circuit malfunction then the circuit is called faulty. 

The description of faulty electrical circuit(s) is included as part of the RCS rep-
resentation. However, it is not necessary to add the description of normal circuits
controlling valve(s), since the program can reason about effects of actions per-
formed on that valve through the basic VCM. This allows for an increase in 
efficiency when computing models of the program. 

The Circuit Theory Module contains approximately 50 rules. 

2.4 Planning Module 

This module establishes the search criteria used by the program to find a plan, 
i.e. a sequence of actions that, if executed, would achieve the goal. The modular 
design of M allows for the creation of a variety of such modules. 

The structure of the Planning Module (PIM) follows the generate and test 
approach described in [8, 20]. Since the RCS contains more than 200 actions, with 
rather complex effects, and may require very long plans, this standard approach 
needs to be substantially improved. This is done by addition of various forms 
of heuristic, domain-dependent information^. In particular, the generation part 
takes advantage of the fact that the RCS consists of three, largely independent, 
subsystems. A plan for the RCS can therefore be viewed as the composition 
of three separate plans that can operate in parallel. Generation is implemented 
using the following rule: 

1{occurs(A,T) : action_of(A,R)}1 :-
    subsystem(R),
    not goal(T,R).

This rule states that exactly one action for each subsystem of the RCS should 
occur at each moment of time, until the goal is reached for that subsystem. 
Notice that the head of this rule has the form L{p(X) : q(X)}U. It defines a
subset p ⊆ q of terms such that L ≤ |p| ≤ U. Normally, there are many possible
sets satisfying these conditions. Hence, a program containing this type of rule
has multiple answer sets, corresponding to possible choices of p.

^ Notice that the addition does not affect the generality of the algorithm.
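
The set of candidate choices described by a head of the form L{p(X) : q(X)}U can be enumerated directly. The Python sketch below is an illustration only (the function name and the sample actions are hypothetical); each yielded subset corresponds to one possible choice of p.

```python
from itertools import combinations


def choices(candidates, lower, upper):
    """Yield every subset p of candidates with lower <= |p| <= upper."""
    items = sorted(candidates)
    for size in range(lower, min(upper, len(items)) + 1):
        for subset in combinations(items, size):
            yield frozenset(subset)


# With bounds 1 and 1, exactly one action is chosen, as in the generation rule.
one_action = list(choices({"open_v1", "open_v2", "close_v1"}, 1, 1))
```

A solver such as SMODELS does not enumerate these subsets blindly; the sketch only shows which choices are admissible.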

In the RCS, the common task is to prepare the shuttle for a given maneuver. The 
goal of preparing for such a maneuver can be split into several subgoals, each 
setting some jets, from a particular subsystem, ready to fire. The overall goal 
can therefore be stated as a composition of the goals of individual subsystems 
containing the desired jets, as follows: 

goal :-
    goal(T1,left_rcs),
    goal(T2,right_rcs),
    goal(T3,fwd_rcs).

The plan testing phase of the search is implemented by the following constraint

:- not goal.

which eliminates the models that do not contain plans for the goal.
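
The generate and test scheme can be sketched outside of A-Prolog as a plain search loop. The following Python code is a toy illustration under stated assumptions: the function names, the state representation and the two-valve domain are all hypothetical, and a stable model solver prunes the search far more cleverly than this exhaustive enumeration.

```python
from itertools import product


def find_plan(actions, initial, apply_action, is_goal, max_steps):
    """Generate action sequences by increasing length and test each one.

    Returns the first sequence whose final state satisfies the goal,
    or None if no sequence of at most max_steps actions does.
    """
    for length in range(max_steps + 1):
        for candidate in product(actions, repeat=length):  # generate
            state = initial
            for action in candidate:
                state = apply_action(state, action)
            if is_goal(state):  # test
                return list(candidate)
    return None


def open_valve(state, valve):
    """Toy transition: opening a valve adds it to the set of open valves."""
    return state | {valve}


plan = find_plan(["v1", "v2"], frozenset(), open_valve,
                 lambda state: state == {"v1", "v2"}, max_steps=3)
```

The number of candidates grows exponentially with the plan length and the number of actions, which is one way to see why the unimproved approach does not scale to a domain with more than 200 actions.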

Splitting into subsystems allows us to improve the efficiency of the module sub- 
stantially. For instance, finding a plan of 6 actions takes 2.14 seconds (Ex3 from 
Table 1), as opposed to the hours required when the representation of the RCS is
not partitioned into subsystems. Notice that, since there are some dependencies
between some subsystems, a very small number of extremely rare (and undesir- 
able) plans can be missed. It’s possible to modify the Planning module in order 
to find these plans, but this issue was not investigated in this paper. 

The module also contains other domain-dependent as well as domain- 
independent heuristics. The reasons for adding such heuristics are two-fold: first, 
to eliminate plans which are correct but unintended, and second, to increase 
efficiency. A-Prolog allows for a concise representation of these heuristics as con- 
straint rules. This can be demonstrated by means of the following examples. 

Some heuristics are instances of domain-independent heuristics. They express 
common-sense knowledge like “under normal conditions, do not perform two 
different actions with the same effect.” In the RCS, there are two different types 
of actions that can move a valve V to a state S: a) flipping to state S the switch,
Sw, that controls V, or b) issuing the (specific) computer command CC capable
of moving V to S. In A-Prolog we can write this heuristic as follows:

:- occurs(flip(Sw,S),T),
   controls(Sw,V),
   occurs(CC,T1),
   commands(CC,V,S),
   not bad_circuitry(V).
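
Read procedurally, this constraint rejects any candidate plan in which a switch flip and a computer command have the same effect on a valve with working circuitry. The Python sketch below illustrates that reading; the function name and the data layout (lists of flips and issued commands, lookup tables for controls and commands) are assumptions made for the example.

```python
def violates_same_effect(flips, issued, controls, commands,
                         bad_circuitry=frozenset()):
    """Return True if the plan contains a flip and a command with the same effect.

    flips: list of (switch, position, time) occurrences.
    issued: list of (command, time) occurrences.
    controls: dict mapping a switch to the valve it controls.
    commands: dict mapping a command to the (valve, position) it produces.
    """
    flip_effects = {(controls[switch], position)
                    for switch, position, _time in flips}
    for command, _time in issued:
        valve, position = commands[command]
        if (valve, position) in flip_effects and valve not in bad_circuitry:
            return True
    return False


controls = {"sw1": "v1"}
commands = {"cc_open_v1": ("v1", "open")}
redundant = violates_same_effect([("sw1", "open", 0)], [("cc_open_v1", 2)],
                                 controls, commands)
```

As in the constraint, the two occurrences may happen at different times, and the check is suspended for valves whose circuitry is faulty.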

More domain-dependent rules embody common-sense knowledge of the type “do
not pressurize nodes which are already pressurized.” In the RCS, some nodes 
can be pressurized through more than one path. Clearly, performing an action 
in order to pressurize a node already pressurized will not invalidate a plan, but 
this involves an unnecessary action. Although we do not discuss optimality of 
plans in this paper, the shortest sequence of actions to achieve the goal is a good 
candidate as the optimal plan(s). The following constraint eliminates models 
where more than one path to pressurize a node N2 is open. 

:- link(N1,N2,V1),
   link(N1,N2,V2),
   neq(V1,V2),
   h(in_state(V1,open),T),
   h(in_state(V2,open),T),
   not stuck(V1,open),
   not stuck(V2,open).



As mentioned before, some heuristics are crucial for the improvement of the plan- 
ner’s efficiency. One of them states that “a normally functioning valve connecting 
nodes N1 and N2 should not be open if N1 is not pressurized.” This heuristic
clearly prunes a significant number of unintended plans. It is represented by a 
constraint that discards all plans in which a valve V is opened before the node, 
preceding it, is pressurized. 

:- link(N1,N2,V),
   h(in_state(V,open),T),
   not h(pressurized_by(N1,Tk),T),
   not has_leak(V),
   not stuck(V).



The improvement offered by domain-dependent heuristics has not been studied 
mathematically here. However, experiments showed impressive results. In the 
case of tasks involving a large number of faults, for example, the introduction 
of some of the most effective heuristics reduced the time required to find a plan 
from hours to seconds. 

Table 1 presents a summary of five experiments, showing the growth of the time 
required, in response to an increase of the complexity of the task. The columns of 
the table indicate: task name; number of RCS subsystems involved in the task;
number of steps required to reach the goal; total number of actions required to 
achieve the goal (actions of different subsystems may be executed in parallel); 
number of faults affecting the RCS; time needed to check a plan; time needed to 
find a plan. 

Table 2 reports the timings obtained for variants of the previous experiments, 
where faults have been introduced in circuits of the system. 

All times are expressed in seconds and were taken on a Pentium II 450MHz
system, running NetBSD 1.4.1, lparse 0.99.59 and SMODELS 2.26.



Task   RCSs   steps   actions   faults   check   plan
Ex1    1      4       4         2        0.82    2.92
Ex2    1      5       5         2        1.01    4.74
Ex3    2      3       6         0        0.66    2.14
Ex4    2      4       8         2        0.82    5.90
Ex5    3      7       21        8        1.37    43.52

Table 1. Results of plan checking and planning on sample tasks without
malfunctioning circuits.



Task   check   plan
Ex1    1.11    3.48
Ex2    1.35    5.55
Ex3    0.88    2.41
Ex4    1.11    6.45
Ex5    1.87    45.80

Table 2. Results of plan checking and planning on sample tasks with
malfunctioning circuits.
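
The relative cost of adding malfunctioning circuits can be read off the two tables. The short Python sketch below recomputes it from the published timings; the variable and function names are ours, and the derived percentages are not reported in the paper.

```python
# (check, plan) times in seconds, copied from Table 1 and Table 2.
without_faulty_circuits = {"Ex1": (0.82, 2.92), "Ex2": (1.01, 4.74),
                           "Ex3": (0.66, 2.14), "Ex4": (0.82, 5.90),
                           "Ex5": (1.37, 43.52)}
with_faulty_circuits = {"Ex1": (1.11, 3.48), "Ex2": (1.35, 5.55),
                        "Ex3": (0.88, 2.41), "Ex4": (1.11, 6.45),
                        "Ex5": (1.87, 45.80)}


def overhead(before, after):
    """Relative increase of a timing, in percent."""
    return (after - before) / before * 100


for task in sorted(without_faulty_circuits):
    check_0, plan_0 = without_faulty_circuits[task]
    check_1, plan_1 = with_faulty_circuits[task]
    print(f"{task}: check +{overhead(check_0, check_1):.1f}%, "
          f"plan +{overhead(plan_0, plan_1):.1f}%")
```

In every task both timings increase when circuits malfunction, and for planning the relative increase is smallest for the largest task, Ex5.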



3 Conclusion 

In this paper we described a medium size decision support system written in 
A-Prolog. This application requires modeling of the operation of a fairly com- 
plex subsystem of the Space Shuttle at a level suitable for use by shuttle flight 
controllers. It is expected that deployment of this system, for use in the space 
program, will begin in December of 2000. The system, while based on previous 
work, represents a substantial advance over its predecessor. 

From the scientific standpoint, this work can be of interest to two groups of peo- 
ple, those interested in answer set programming and those interested in planning. 
We hope both groups will be glad to learn about the existence of a comparatively 
big and practical software system written in A-Prolog. 

The former group can also learn about advantages of A-Prolog with respect 
to standard Prolog, evident even in the case of plan checking. An important 
methodological lesson we learned from this exercise is the importance of careful 
initial design. For instance, introduction of junction nodes in the model of the 
Plumbing Module of the RCS substantially simplified the resulting program.
We are also satisfied with our use of the Java interface for selecting modules 
necessary for solving a given problem, and integrating these modules into a final 
A-Prolog program. Structuring most modules as lp-functions contributed to the
reusability and proof of correctness of the integration®. Such proof is especially 
important due to the critical nature of the RCS. 

Readers from the planning community may find it interesting to see a system of
substantial size built on a theory of actions and change. In particular, we were somewhat
surprised by the importance of static causal laws in our model. We are not sure 
that the use of STRIPS-like languages containing only dynamic causal laws is 
sufficient for a concise representation of the RCS, and especially of the extended 
VCM. 

The use of A-Prolog allowed us to deal with recursive causal laws, which may 
pose a problem to more classical planning methods. (A partial solution to this
problem is suggested in [9], where the authors use CCALC ([22]) to reduce the
computation of answer sets to the computation of models of some propositional
formula. They give a sufficient condition for the correctness of such a transforma-
tion. Unfortunately, the idea does not apply here, since the corresponding graph 
is not acyclic.) 

Recent work in planning drew attention to the problem of finding a language 
which would allow a declarative and efficient representation of heuristic informa- 
tion [1, 18, 17, 10]. We believe that this paper demonstrates that a large amount 
of such information can be naturally expressed in A-Prolog. Moreover, its use 
dramatically improves the efficiency of the planner (which is not always the case for
satisfiability-based planners).

Finally, it may be interesting to see how modularity allows planning to be per- 
formed in different levels. It is easy, for instance, to modify our planning module 
to search for manual plans, i.e., those not including computer commands. The 
new planner will be much more efficient and, in many cases, sufficient for the 
flight controllers’ needs. We plan to apply these techniques to modeling
other systems of the Space Shuttle.

® To give an example of what we learned here, let us consider the following situation:
suppose you have lp-functions f and g correctly implementing the plumbing and basic
VCM modules of the system; integration of these modules leads to the creation of
a new lp-function h = f ∘ g. It is known that, due to non-monotonicity of A-Prolog, a logic
programming representation of this function cannot always be obtained by combining
together the rules of f and g. In our case, however, a general theorem [11] can be used to
check if this is indeed the case. We are currently working on formulating and proving 
the correctness of the complete integration. 

Abstract. Intelligence and interaction are two key issues in the engi-
neering of today's complex systems, like Internet-based ones. To make
logic languages accomplish their vocation of sound enabling technologies 
for intelligent components, we first need their implementations to strictly 
meet some engineering properties such as deployability, configurability, 
and scalability. Then, we should provide them with a wide range of inter- 
action capabilities, according to standards and common practices. This 
would make logic-based systems also viable tools to build deployable, 
configurable, dynamic, and possibly intelligent infrastructures. 

In this paper, we present tuProlog, a light-weight Java-based system 
allowing configurable and scalable Prolog components to be built and 
integrated into standard Internet applications according to a multiplic- 
ity of different interaction patterns, like JavaBeans, RMI, CORBA, and 
TCP/IP. Even more, tuProlog offers basic coordination capabilities in 
terms of logic tuple spaces, which allow complex Internet-based architec- 
tures to be designed and governed. This makes it possible to use tuProlog 
as the core enabling technology for Internet infrastructures - as in the 
case of the TuCSoN and LuCe infrastructures for the coordination of 
Internet-based multi-agent systems. 



1 The Role of Logic Languages 

Two fundamental issues in both the ubiquitous computing paradigm [19] and 
the net computing model [18] are: 

— interaction - systems are built by putting together sub-systems (like objects, 
processes, components, agents) that interact so as to achieve a global system 
goal; 

— intelligence - the growing complexity and dynamics of the application sce- 
narios and the ever-growing requirements from non-skilled users call for del- 
egation of responsibilities from users to systems, as well as for distribution 
and localisation of deliberative capabilities within the intelligent components 
of a system. 

Multi-agent systems (MAS henceforth) emphasise both issues, by exploiting 
agents as units encapsulating intelligence and interacting with other agents so 
as to achieve their goals. Even more, the Internet conveys the request for appli-
cation intelligence coming from millions of non-skilled users, and represents as 
well a highly-demanding environment for system design and development, given 
its dynamics, heterogeneity, decentralisation, and unpredictability. 

The complexity of Internet-based system engineering calls for suitable infras- 
tructures, meant to make the designers’ and developers’ task easier by provid- 
ing commonly-required services to applications. In particular, easily deployable 
infrastructures are needed, which can (i) be easily configured to match the ap-
plication needs, both statically and dynamically, (ii) rule component and ap-
plication interaction, and possibly (iii) encapsulate some form of intelligence to
be exploited by applications. In this scenario, where Software Engineering, Pro- 
gramming Languages, and (Distributed) Artificial Intelligence meet, logic-based 
languages are fighting to find a role to play. Roughly speaking, in the common 
perception of computer scientists and engineers, “logic” typically sounds good 
when coupled with “intelligence” , but sounds bad with “interaction” . 

So, on the one hand, the vocation of logic languages is to work as sound 
enabling technologies to build intelligent components. In the context of Internet- 
based application development, this implies that logic languages should be im- 
plemented so as to meet several engineering criteria, such as deployability,
scalability, and interoperability. On the other hand, logic-based languages already
proved to be effective both as communication and as coordination languages [5], 
as well as in facing specific issues like security in a declarative way. This suggests 
that - contrary to the perception of logic languages outside the research field 
itself - logic-based languages could play a key-role also in the deployment of 
scalable, configurable, intelligent Internet-based infrastructures. 

In this article, we present tuProlog, a Java-based Prolog designed to build 
Internet-based intelligent components, which are (i) easily deployable, (ii) light- 
weight, (iii) scalable, (iv) statically and dynamically configurable, and (v) in-
teroperable. For this purpose, tuProlog makes a core Prolog inferential engine 
available as a Java class, so that an unlimited number of tuProlog engines can 
be exploited at the same time by the same application or process. Each tuProlog 
engine can be configured independently, according to the specific needs (in terms 
of both the language extensions and the clause database), and integrated into a
system according to the preferred/required interaction pattern: as a Java object,
a Java bean, via RMI or CORBA, or as an Internet service. 

In addition to the above Internet standard interaction patterns, tuProlog 
integrates basic coordination capabilities, by providing logic tuple spaces as co- 
ordination media, which can be exploited to engineer non-trivial architectures 
involving heterogeneous components interacting through a distributed environ- 
ment like the Internet. This makes tuProlog a good choice as an enabling technol- 
ogy for flexible and effective Internet infrastructures. In fact, tuProlog actually 
works as the core for both the TuCSoN [13] and LuCe [5] infrastructures for the 
coordination of Internet-based multi-agent systems. 

Accordingly, Section 2 discusses in general the issue of interaction and coor- 
dination patterns for a Prolog system in the context of Internet-based system 
engineering. Section 3 presents the tuProlog system and architecture, and briefly
describes how it is exploited as the core technology of the TuCSoN and LuCe in-
frastructures. Finally, Section 4 introduces two examples, and Section 5 discusses 
related works and provides some conclusions. 



2 Interaction and Coordination in tuProlog 

The availability of the tuProlog Virtual Machine (VM) as a Java object makes 
the expressive power of a declarative language like Prolog available to Java ap- 
plications. This feature makes it possible, for instance, to express and realise the 
algorithmic part of an intelligent service in a natural (declarative) way, while 
retaining the advantages of an object-oriented language in terms of system or- 
ganisation for software engineering purposes - testing, debugging, portability - 
as well as for development and deployment issues - standard platforms, easy-to- 
use tools, libraries, and so on. But the real benefit of a tuProlog VM as a Java
object concerns the coordination dimension - in particular, the relationship
between such a twofold object and the other application components, in terms of
which kinds of interaction can take place, at what levels of abstraction, and
what they may be useful for.

First, the open architecture of tuProlog enables any Java component to be accessed
and used from Prolog in a straightforward way by means of the JavaLibrary
library. In this way, in particular, all interaction-related Java packages, such as
Swing, JDBC, and RMI, can be seamlessly exploited to increase the interaction
capabilities of tuProlog.

Moreover, since the tuProlog VM is a Java class, each application can instan- 
tiate as many VMs as it likes, and use them just as it would use any other object 
for its computing purposes - by reference. More generally, each Prolog concept 
(term, structure, atom) has a Java counterpart that is a Java class, and each 
specific Prolog entity is mapped onto a corresponding Java object. Computa- 
tions can be carried on by choosing at any time the most suitable language and 
level of abstraction, switching between Java imperative object-oriented style and 
Prolog declarative style whenever appropriate. Even more, tuProlog also complies
with the JavaBeans model, so that its VM can be exploited as a bean by any
application adopting the JavaBeans architectural pattern. Meanwhile, the Java
application may well adopt other means, such as the I/O package, to interact
with remote services / components or with the file system via streams.

Also, a tuProlog VM may take the form of a Java thread class, so that any
Java application can start its own Prolog-solving thread(s) whenever necessary:
each thread may then be charged with proving a given goal with respect to a given
theory, and terminate after the proof is over. For instance, a Java application
may provide a sort of Prolog “goal solving service”, taking the goal to be solved
(as well as the related theory) from the operating system’s command line. Here,
the Java application would be in charge of handling the user interface, and of
starting the tuProlog VM thread to actually get the required solutions. In this
scenario, despite the threaded nature of the tuProlog VM object, inter-object
interaction would still basically occur as above - i.e., by reference - through the
exploitation of some shared objects.

Even more, in the Internet world, a typical approach to providing standard
services is to put a so-called daemon waiting for service requests on a standard
port. In this context, a user-configurable Prolog inference engine available
in that form is likely to be an appealing service. So, tuProlog makes an
ever-running Java daemon available and deployable, providing in principle any active
entity on the Internet with the ability to access a Prolog service to solve its
own goals based on its own theories. Still, the daemon pattern adopts a
point-to-point, direct communication pattern, where interacting entities are strictly
coupled both in time and in space. A more sophisticated interaction pattern is
made available by the RMI and CORBA packages, making tuProlog an even more
flexible tool for heterogeneous and distributed applications.

However, these approaches do not completely solve the problems of interaction
coupling, and in some sense prevent forms of coordination more complex than
peer-to-peer models from being enacted [2]. So, tuProlog makes it possible
to adopt a logic tuple-based coordination abstraction as a unifying interaction
metaphor, and to exploit it either locally - i.e., to support interaction of
components within a single Java application - or throughout the Internet, with the
support of a coordination infrastructure like TuCSoN or LuCe. First, both Java
and tuProlog components can interact via a “default” Linda-like tuple space, using
standard Linda primitives [8]. Despite its simplicity, this approach provides
all the properties of tuple-based indirect communication [14]: name uncoupling
(communicating entities do not have to name each other to interact), space
uncoupling (communicating entities do not have to be in the same place to interact),
and time uncoupling (communicating entities do not have to coexist in time
to interact), which are of valuable help when designing and developing open
systems made of several heterogeneous and distributed components [2].
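The three forms of uncoupling can be made concrete with a minimal sketch of a Linda-like tuple space. The class below is purely illustrative: TupleSpaceSketch, its string-array tuples, and the use of null as a wildcard are assumptions made for this example, and it is not the tuple space shipped with tuProlog.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative Linda-like tuple space; NOT the tuProlog tuple space API. */
class TupleSpaceSketch {
    // Tuples are plain string arrays here; a null field in a template is a wildcard.
    private final List<String[]> tuples = new ArrayList<>();

    /** out: insert a tuple and wake up any waiting reader. */
    public synchronized void out(String... tuple) {
        tuples.add(tuple.clone());
        notifyAll();
    }

    /** in: block until a matching tuple exists, then remove and return it. */
    public synchronized String[] in(String... template) {
        String[] found;
        while ((found = find(template)) == null) await();
        tuples.remove(found);
        return found;
    }

    /** rd: like in, but the tuple stays in the space. */
    public synchronized String[] rd(String... template) {
        String[] found;
        while ((found = find(template)) == null) await();
        return found.clone();
    }

    private void await() {
        try {
            wait();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a tuple", e);
        }
    }

    private String[] find(String[] template) {
        for (String[] t : tuples) if (matches(template, t)) return t;
        return null;
    }

    private static boolean matches(String[] template, String[] tuple) {
        if (template.length != tuple.length) return false;
        for (int i = 0; i < template.length; i++)
            if (template[i] != null && !template[i].equals(tuple[i])) return false;
        return true;
    }

    public static void main(String[] args) throws Exception {
        TupleSpaceSketch space = new TupleSpaceSketch();
        // The producer writes and terminates before the consumer asks (time uncoupling).
        Thread producer = new Thread(() -> space.out("distance", "bologna", "cesena"));
        producer.start();
        producer.join();
        // The consumer never names the producer (name uncoupling).
        String[] t = space.in("distance", "bologna", null);
        System.out.println(t[2]);
    }
}
```

In this sketch the consumer retrieves the tuple by content only, after the producer thread has already terminated; space uncoupling would additionally require the space to be reachable over the network, which is left out here.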

Beyond tuple spaces, the power of tuple centres is available to tuProlog com- 
ponents [12]. Featuring the fundamental property of programmability of the coor- 
dination medium, tuple centres make it possible to govern the interaction among 
the components of a system. The system’s global behaviour can thus be governed 
by expressing interaction laws as coordination laws embedded into the
coordination media - the tuple centres. In particular, both Java and tuProlog components
can exploit ReSpecT tuple centres as their coordination media - where ReSpecT
is a logic-based language used to specify a tuple centre’s behaviour [4].
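To illustrate what programmability of the coordination medium means, the following sketch adds reactions to a very small tuple store. It is a simplification under stated assumptions: TupleCentreSketch, its string tuples, and the Java lambdas used as reaction bodies are hypothetical, whereas ReSpecT expresses reactions as logic tuples; in this sketch the primitives used inside a reaction also do not trigger further reactions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;
import java.util.function.Predicate;

/** Illustrative programmable tuple centre; the real ReSpecT model is logic-based and richer. */
class TupleCentreSketch {
    private final List<String> tuples = new ArrayList<>();
    private final List<Predicate<String>> triggers = new ArrayList<>();
    private final List<BiConsumer<TupleCentreSketch, String>> bodies = new ArrayList<>();

    /** Register a coordination law: when a matching tuple is emitted, run the body. */
    public void reaction(Predicate<String> trigger, BiConsumer<TupleCentreSketch, String> body) {
        triggers.add(trigger);
        bodies.add(body);
    }

    /** out as seen by agents: store the tuple, then fire the matching reactions. */
    public void out(String tuple) {
        tuples.add(tuple);
        for (int i = 0; i < triggers.size(); i++)
            if (triggers.get(i).test(tuple)) bodies.get(i).accept(this, tuple);
    }

    /** Primitives meant for reaction bodies; here they do not trigger further reactions. */
    public void outR(String tuple) { tuples.add(tuple); }
    public boolean inR(String tuple) { return tuples.remove(tuple); }

    public List<String> snapshot() { return new ArrayList<>(tuples); }

    public static void main(String[] args) {
        TupleCentreSketch centre = new TupleCentreSketch();
        // Law: a batch tuple "init:a,b" is consumed and replaced by its elements.
        centre.reaction(t -> t.startsWith("init:"), (tc, t) -> {
            tc.inR(t);
            for (String part : t.substring(5).split(",")) tc.outR(part);
        });
        centre.out("init:first(x),next_id(1)");
        System.out.println(centre.snapshot());
    }
}
```

The agent only performs an ordinary out; the effect on the shared state is decided by the law stored in the medium, which is the point of a tuple centre as opposed to a plain tuple space.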

Finally, in the case of Internet-based multi-agent systems, both Java and 
tuProlog agents can exploit either the TuCSoN or the LuCe infrastructure, which 
both rely on ReSpecT tuple centres as their coordination media. In this way, any 
Java or tuProlog agent can in principle interact with any other agent on the In- 
ternet in either a network-aware (in TuCSoN) or network-transparent fashion (in 
LuCe) through a multiplicity of tuple centres distributed over the Internet. How- 
ever, since a complete coverage of coordination infrastructures and architectures 
is outside the scope of this paper, readers interested in an in-depth discussion of 
TuCSoN and LuCe are kindly referred to [13] and [5], respectively.



3 tuProlog: Architecture and Use

The architecture of tuProlog is first meant to satisfy three main requirements: 

— a minimal, efficient, ISO-compliant Prolog VM, available as a self-contained
Java object, featuring a simple, minimal interface;

— extendibility of the core by means of dynamically linkable/dischargeable li- 
braries of built-in predicates; 

— availability of effective tools enabling interoperability and managing inter- 
action between Prolog and Internet components. 

The first feature makes a light-weight Prolog engine available to any Java entity, 
bringing the expressiveness of a standard declarative language into the de facto 
standard object-oriented framework - making it possible, in principle, to take 
the best from the two worlds. The second feature is the necessary counterpart 
of the minimality required from the tuProlog core: in this way, features such as 
special computational abilities, domain-specific knowledge, and resource
accessibility, can be made available to the inference engine according to the user’s need.
Perhaps the most relevant, the third feature calls for the availability of a
multiplicity of interaction technologies and coordination models (TCP/IP, RMI,
CORBA, tuple-based media), making it possible to easily integrate tuProlog
components in the design and deployment of complex Internet-based systems.

3.1 The Prolog Core 

In order to enable tuProlog resources to be exploited from a Java context, a Java 
representation of the relevant Prolog entities must be given, including Prolog 
standard data objects and theories. 

Prolog standard data objects, such as integers and floats, are mapped into Java
Term objects. This class is actually the root class of this hierarchy, providing
both direct handling of primitive types and the basic interface (in particular, the
unify method) for all terms. More specific classes, such as Compound and Var,
extend Term as appropriate. Prolog variables are mapped into Var objects, each
labelled by a string. Prolog compound terms, consisting of a functor and a list
of arguments, are represented as Java Compound objects, which are Terms whose
functor is a string, and whose arguments are Terms themselves. In particular,
strings are mapped as Compounds with no arguments, and lists are mapped as
Compound objects with the list functor and two Term arguments (head and tail).



    Theory theory1 = new Theory(new FileInputStream("theory.p"));

    String theoryString = "append([],X,X).\n" +
                          "append([X|L1],L2,[X|L3]) :- append(L1,L2,L3).\n";
    Theory theory2 = new Theory(new ByteArrayInputStream(theoryString.getBytes()));



Table 1. Instantiating new tuProlog theories from either a text file or a String
object.



    public interface Prolog {
        void setTheory(Theory th) throws InvalidTheoryException;
        Theory getTheory();
        void loadLibrary(String libname) throws InvalidLibraryException;
        void unloadLibrary(String libname) throws UnknownLibrary;
        SolveInfo solve(Term t);
        SolveInfo solve(String term) throws MalformedGoalException;
        SolveInfo solveNext() throws NoMoreSolutionException;
    }



Table 2. The core Prolog interface.
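The mapping of Prolog terms onto a Java class hierarchy with a unify method can be sketched as follows. This is a minimal illustration written for this text, not the tuProlog source: the class names MiniTerm, MiniVar and MiniCompound are hypothetical, primitive types are left out, and unification is done without occurs check and without undoing bindings on failure.

```java
import java.util.Arrays;

class MiniTermDemo {
    public static void main(String[] args) {
        MiniVar x = new MiniVar("X");
        MiniTerm left = new MiniCompound("distance", new MiniCompound("bologna"), x);
        MiniTerm right = new MiniCompound("distance",
                new MiniCompound("bologna"), new MiniCompound("cesena"));
        // Unifying distance(bologna, X) with distance(bologna, cesena) binds X.
        System.out.println(left.unify(right) + " " + x);
    }
}

/** Hypothetical root of the term hierarchy; it only mirrors the shape described in the text. */
abstract class MiniTerm {
    /** Follow variable bindings until an unbound variable or a non-variable term. */
    MiniTerm deref() { return this; }

    /** Unify this term with another one, binding variables as needed. */
    boolean unify(MiniTerm other) {
        MiniTerm a = this.deref();
        MiniTerm b = other.deref();
        if (a == b) return true;
        if (a instanceof MiniVar) { ((MiniVar) a).binding = b; return true; }
        if (b instanceof MiniVar) { ((MiniVar) b).binding = a; return true; }
        MiniCompound x = (MiniCompound) a;
        MiniCompound y = (MiniCompound) b;
        if (!x.functor.equals(y.functor) || x.args.length != y.args.length) return false;
        for (int i = 0; i < x.args.length; i++)
            if (!x.args[i].unify(y.args[i])) return false;
        return true;
    }
}

/** A variable labelled by a string, possibly bound to another term. */
class MiniVar extends MiniTerm {
    final String name;
    MiniTerm binding;
    MiniVar(String name) { this.name = name; }
    @Override MiniTerm deref() { return binding == null ? this : binding.deref(); }
    @Override public String toString() { return binding == null ? name : deref().toString(); }
}

/** A functor with arguments; an atom is a compound with no arguments. */
class MiniCompound extends MiniTerm {
    final String functor;
    final MiniTerm[] args;
    MiniCompound(String functor, MiniTerm... args) { this.functor = functor; this.args = args; }
    @Override public String toString() {
        if (args.length == 0) return functor;
        String s = Arrays.toString(args);
        return functor + "(" + s.substring(1, s.length() - 1) + ")";
    }
}
```

Placing unify in the root class, as the text describes for Term, lets client code unify any two terms without knowing their concrete classes.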

Prolog theories are represented by the Java class Theory, whose instances
are built from a textual representation taken from any input stream. For
instance, a Theory object may be built from a file or a string, as shown in
Table 1. The (minimal) interface provided by Theory enables a theory to be (i)
appended to another theory, and (ii) written to an OutputStream - for instance,
to be saved to a file.

The Prolog core itself is represented by the Prolog class, whose instances are 
Prolog VMs. This class provides a minimal interface that enables users to: 

— set/get the theory to be used for demonstrations; 

— load/unload libraries; 

— solve a goal, represented either by a Term object or by a textual representa- 
tion (a String object) of a term. 

Table 2 shows the core Prolog interface. The solve method provides the first
solution. Then, the user can (iteratively) request further solutions using the
solveNext method. Both methods return an object implementing the SolveInfo
interface (Table 3), which makes the result of the proof available.

A comprehensive discussion of the tuProlog VM structure is clearly beyond
the scope of this paper; however, a few essential features are worth pointing



    public interface SolveInfo {
        boolean success();
        boolean hasOpenAlternatives();
        Substitution getSubstitution() throws NoSolutionException;
        Term getSolution() throws NoSolutionException;
        boolean halt();
        int getHaltCode() throws NoHaltException;
    }



Table 3. The SolveInfo interface.



    #elements   tuProlog   Jinni   W-Prolog   CKI Prolog   jProlog
    30          0.03       0.09    0.11       <1           8
    60          0.15       0.35    0.28       1            100
    120         1          1       0.74       2.5          > 500
    180         3.5        2       1.6        8            > 500
    240         9          4       3          52           > 500
    300         16         6       > 500      122          > 500



Table 4. Naive list reverse benchmark (in seconds).



    import tuprolog.vm.*;
    import java.io.*;

    public class PrologTest {
        public static void main(String args[]) {
            try {
                // virtual machine creation (standard ISOLibrary is linked by default)
                Prolog engine = new Prolog();
                // consult a theory from the file indicated by the first argument
                engine.setTheory(new Theory(new FileInputStream(args[0])));
                // ask the engine to solve the goal given as a string by the second argument
                SolveInfo info = engine.solve(args[1]);
                if (!info.success()) {
                    System.out.println("no.");
                } else {
                    if (!info.hasOpenAlternatives()) {
                        System.out.println("yes: " + info.getSolution());
                    } else {
                        BufferedReader stdin = new BufferedReader(
                            new InputStreamReader(System.in));
                        while (info.hasOpenAlternatives()) {
                            System.out.println(info.getSolution() + " ?");
                            String answer = stdin.readLine();
                            if (!answer.equals(";")) {
                                System.out.println("yes: " + info.getSubstitution());
                                break;
                            } else {
                                info = engine.solveNext();
                                if (!info.success()) { System.out.println("no."); break; }
                            }
                        }
                    }
                }
                // save the current theory on the file specified by the third argument
                new FileOutputStream(args[2]).write(engine.getTheory().toString().getBytes());
            } catch (Exception e) {
                // main is void, so a failure is reported through the exit status
                System.exit(-1);
            }
        }
    }



Table 5. An example of use of tuProlog from Java.



out. First, the tuProlog VM is basically implemented as an object-oriented SLD
solver, where all the terms involved in a proof (including the clauses) are
represented as Java objects. In particular, being Term objects or instances of derived
classes, all Prolog data objects offer services such as variable renaming and
unification. Then, with respect to traditional implementations, tuProlog does
not use a trail stack, since every variable keeps track of its bindings, and is
able to automatically restore its previous state in case of backtracking.
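The text does not detail how a variable keeps track of its own bindings. One possible reading, given here only as a hedged sketch, is that each variable stores a private history of bindings tagged with the choice-point depth at which they were made, so that backtracking amounts to asking the variable to drop the bindings made beyond a given depth. The class HistoryVar and its methods are hypothetical and are not taken from the tuProlog sources.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical variable that records its own binding history (not tuProlog's Var class). */
class HistoryVar {
    // Each entry pairs a binding with the choice-point depth at which it was made.
    private static final class Entry {
        final int depth;
        final Object value;
        Entry(int depth, Object value) { this.depth = depth; this.value = value; }
    }

    private final Deque<Entry> history = new ArrayDeque<>();

    /** Bind the variable while the solver is at the given choice-point depth. */
    void bind(Object value, int depth) { history.push(new Entry(depth, value)); }

    /** Current binding, or null when the variable is unbound. */
    Object value() { return history.isEmpty() ? null : history.peek().value; }

    /** Backtrack: forget every binding made deeper than the given depth. */
    void restore(int depth) {
        while (!history.isEmpty() && history.peek().depth > depth) history.pop();
    }

    public static void main(String[] args) {
        HistoryVar x = new HistoryVar();
        x.bind("a", 1);                  // first alternative, tried at depth 1
        x.restore(0);                    // it fails: back to the state before depth 1
        System.out.println(x.value());   // the variable is unbound again
        x.bind("b", 1);                  // second alternative
        System.out.println(x.value());
    }
}
```

Compared with a global trail, the undo information lives inside each variable object, which fits the object-oriented design described above; whether tuProlog implements it exactly this way is not stated in the text.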

A new implementation of the Prolog VM obviously raises the issue of per- 
formance. Even though this was not a primary concern of this project - which 
aims at achieving reasonable rather than maximal efficiency - extensive tests are 
currently being carried out. Early results indicate that tuProlog is reasonably
efficient, even in the current non-optimised version. For instance, Table 4 reports
the results of Warren’s well-known naive reverse benchmark, executed on a
400 MHz Pentium II machine with 256 MB of memory, using JDK 1.3.
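For readers unfamiliar with the benchmark, the sketch below restates naive reverse in Java and counts one logical inference per clause application. It only illustrates the workload: the class NaiveReverseSketch and its counter are assumptions of this example, and the timings in Table 4 were of course obtained by running the Prolog program on each system, not this code.

```java
/** Naive list reverse with an inference counter, written in Java for illustration only. */
class NaiveReverseSketch {
    /** Immutable cons cell; null plays the role of the empty list. */
    static final class Cons {
        final int head;
        final Cons tail;
        Cons(int head, Cons tail) { this.head = head; this.tail = tail; }
    }

    static long inferences = 0;

    // append([], L, L).    append([H|T], L, [H|R]) :- append(T, L, R).
    static Cons append(Cons a, Cons b) {
        inferences++;
        return a == null ? b : new Cons(a.head, append(a.tail, b));
    }

    // nrev([], []).    nrev([H|T], R) :- nrev(T, RT), append(RT, [H], R).
    static Cons nrev(Cons list) {
        inferences++;
        return list == null ? null : append(nrev(list.tail), new Cons(list.head, null));
    }

    /** Build the list [1, 2, ..., n]. */
    static Cons range(int n) {
        Cons list = null;
        for (int i = n; i >= 1; i--) list = new Cons(i, list);
        return list;
    }

    public static void main(String[] args) {
        Cons reversed = nrev(range(30));
        System.out.println(reversed.head + " " + inferences);
    }
}
```

Reversing a list of n elements this way takes (n+1)(n+2)/2 inferences, that is, 496 for the 30-element case, and the quadratic growth is consistent with the way the timings in Table 4 rise with the list length.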

Table 5 sketches the use of the tuProlog VM from within a Java application: 
given the name of a file containing the textual representation of a theory as 
the first input argument, and a goal to be proved as the second argument, the 
application tries to prove the goal, producing solutions as unified goals. When 
the proof is over, the theory is saved on a file whose name is specified as the third 
argument. The simple structure of the interface also allows the tuProlog core to 
be used as a Java bean, which can be exploited by any visual programming tool as 
easily as any other bean component. As an example, a JavaBeans implementation 
of the tuProlog IDE is included in the distribution [17]. 

In order to enable the construction of pure Prolog components, such as
intelligent agents, tuProlog allows independent Prolog processes to be used to
prove a goal (with respect to a given theory) in a separate thread via the
tuprolog.runtime.Runner class. There, a Prolog agent can be easily initiated
by defining its associated logic theory, and passing it the startup goal as a
parametric doorway to affect its behaviour as defined by the theory. In the current
tuProlog implementation, this service is provided in two forms: as a Java object
that can be freely exploited by any Java program, and as a ready-to-use Java
application, where the theory and the goal to be proved are given as command
line arguments.

3.2 Core Extendibility 

A fundamental ingredient of the tuProlog architecture is represented by its
built-in libraries, whose Java representation is based on a class hierarchy rooted
in the Library class. Any library must derive from this class, which imposes the
implementation constraints needed to ensure that interaction between the Prolog
core and the library itself occurs properly. Currently, the following basic libraries
are supplied:

— MetaLibrary, providing primitives to consult a theory, to load/unload a 
library, and to spawn new Prolog processes; 

— JavaLibrary, providing an interface to access, create and manage Java
objects, classes, and packages from Prolog sources;

— ISOLibrary, providing the standard ISO Prolog built-in predicates;

— LogicTSLibrary and RespectTCLibrary, enabling tuProlog components to 
exploit logic tuple spaces and ReSpecT logic tuple centres, respectively, as 
coordination media within a Java process; 

— TucsonLibrary and LuceLibrary, empowering tuProlog with the full access 
to the TuCSoN or LuCe Internet infrastructure, respectively. 

For practical use, MetaLibrary, JavaLibrary, and ISOLibrary are loaded au- 
tomatically by the default Prolog constructor: of course, a minimal Prolog core 
can be obtained anyway. At the implementation level, the dynamic extendibility 
of the core is achieved by exploiting the Java reflection core, getting at run-time 
the interfaces to instances of dynamically loaded libraries. 
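The run-time linking of libraries through reflection can be sketched as follows. All names in the block (LibraryLoaderSketch, SketchLibrary, ListLibrary, the predicates method) are hypothetical stand-ins chosen for this example; only the standard calls Class.forName and getDeclaredConstructor().newInstance() are real Java reflection API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch of dynamic library loading; class names here are hypothetical, not tuProlog's. */
class LibraryLoaderSketch {
    /** Stand-in for the root of the library hierarchy. */
    public abstract static class SketchLibrary {
        public abstract String[] predicates();
    }

    /** An example library contributing two built-ins. */
    public static class ListLibrary extends SketchLibrary {
        public ListLibrary() {}
        public String[] predicates() { return new String[] {"append/3", "member/2"}; }
    }

    private final Map<String, SketchLibrary> loaded = new LinkedHashMap<>();

    /** Load a library by class name, checking at run time that it derives from the root class. */
    public void loadLibrary(String className) {
        try {
            Class<?> cls = Class.forName(className);
            if (!SketchLibrary.class.isAssignableFrom(cls))
                throw new IllegalArgumentException(className + " is not a library");
            loaded.put(className, (SketchLibrary) cls.getDeclaredConstructor().newInstance());
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("cannot load " + className, e);
        }
    }

    /** Discharge a previously loaded library. */
    public void unloadLibrary(String className) { loaded.remove(className); }

    /** True if some loaded library provides the given predicate indicator. */
    public boolean provides(String predicate) {
        for (SketchLibrary lib : loaded.values())
            for (String p : lib.predicates())
                if (p.equals(predicate)) return true;
        return false;
    }

    public static void main(String[] args) {
        LibraryLoaderSketch core = new LibraryLoaderSketch();
        String name = ListLibrary.class.getName();
        core.loadLibrary(name);
        System.out.println(core.provides("append/3"));
        core.unloadLibrary(name);
        System.out.println(core.provides("append/3"));
    }
}
```

Because the core only sees the class name as a string, new libraries can be added without recompiling the core, while the check against the root class plays the role of the implementation constraints mentioned above.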

3.3 The JavaLibrary Library 

JavaLibrary enables Java components to be directly accessed and used from
Prolog in a simple and effective way, thus delivering all the power of
existing Java software to tuProlog sources. In this way, all Java packages
involving interaction, such as Swing, JDBC, and RMI, are immediately
available to increase the interaction abilities of tuProlog. JavaLibrary provides
the following basic predicates:

— java_object/3 creates a new Java object of the specified class, possibly
providing the required arguments to the class constructor. The reference to
the newly-created object is bound to a Prolog identifier (a ground term)
which can be used to operate on the object from Prolog.

— <-/2 invokes a method on a Java object represented by its Prolog identifier,
according to a “send message” pattern. With some further syntax, the same
predicate allows both access to public fields and invocation of static class
methods.

— returns/2 unifies the value returned from the invocation of a non-void Java
method with a Prolog term.

To better understand the use of JavaLibrary, consider the example below: 

    java_object('tuprolog.demo.Counter', [], myCounter),
    myCounter <- set(5),
    myCounter <- get returns X,
    write(X).

Here, a Counter object (whose class file is tuprolog/demo/Counter.class) is
created, and bound to the Prolog constant myCounter. This constant is then
used for method invocation via the <- operator, calling first the (void) set()
Java method with argument 5, then the get() Java method without arguments.
Since the get method returns an integer value, the returns operator retrieves
the method result (hopefully, 5) and unifies it with the X Prolog variable, which
is finally printed via the write/1 built-in predicate. The only requirement for this
example to run is the presence of the Counter.class file in the proper position in
the file system, according to Java conventions. In particular, unlike other Prolog
implementations offering a Java/Prolog interface, tuProlog requires neither
special headers nor preliminary operations of any sort, like pre-compilation.



    % using a Swing component from a tuProlog program
    test_open_file_dialog(FileName) :-
        java_object('javax.swing.JFileChooser', [], Dialog),
        Dialog <- showOpenDialog((null as 'java.awt.Component')),
        Dialog <- getSelectedFile returns File,
        File <- getName returns FileName.

    % using JDBC API to access a database called "distances"
    test_access_dbase :-
        'java.lang.Class' <- forName('sun.jdbc.odbc.JdbcOdbcDriver'),
        'java.sql.DriverManager' <- getConnection('jdbc:odbc:distances')
            returns Connection,
        Connection <- createStatement returns Statement,
        Statement <- executeQuery('SELECT city_from,city_to,distance FROM distances')
            returns Res,
        Res <- next returns true, !,
        Res <- getString('city_from') returns From,
        Res <- getString('city_to') returns To,
        Res <- getInt('distance') returns Dist,
        assert(distance(From, To, Dist)).



Table 6. Choosing a file via a Swing GUI (top) and accessing a database via
JDBC (bottom) from tuProlog via JavaLibrary.

So, JavaLibrary allows GUI components such as Swing-based interfaces to
be seamlessly created and used from Prolog. In the same way, databases can
be accessed by exploiting the Java JDBC service to make SQL queries and retrieve
results, Internet protocols such as HTTP or FTP can be handled, libraries
handling XML can be exploited, etc. For the sake of concreteness, Table 6 shows a
couple of small examples: the first exploits Swing to graphically choose a file
from Prolog, while the other uses JDBC to access a database via SQL and stores
the retrieved data in the Prolog database, ready for subsequent reasoning.

The JavaLibrary implementation strongly relies on the Java reflection API, both
to create objects from the class name and parameter descriptions, and to
identify and invoke methods by their name (strings) and argument list. More
generally, reflection plays a relevant role in complying with all the previously-advocated
tuProlog requirements about dynamicity, expressiveness, and flexibility.
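How reflection supports object creation from a class name and method invocation by name can be sketched in plain Java. The bridge below is an assumption-laden simplification: ReflectionBridgeSketch, its javaObject and send methods, and the nested Counter class are invented for this example, and constructors and methods are selected by name and argument count only, whereas a real implementation would also have to match argument types.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Map;

/** Sketch of a name-based object bridge; not the actual JavaLibrary implementation. */
class ReflectionBridgeSketch {
    /** Hypothetical counterpart of the Counter class used in the Prolog example. */
    public static class Counter {
        private int value;
        public Counter() {}
        public void set(int v) { value = v; }
        public int get() { return value; }
    }

    // Identifiers (like the Prolog constant myCounter) mapped to live Java objects.
    private final Map<String, Object> objects = new HashMap<>();

    /** Rough analogue of java_object/3: create an instance and bind it to an identifier. */
    public void javaObject(String className, Object[] args, String id) {
        try {
            for (Constructor<?> c : Class.forName(className).getConstructors())
                if (c.getParameterCount() == args.length) {
                    objects.put(id, c.newInstance(args));
                    return;
                }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
        throw new IllegalArgumentException("no constructor " + className + "/" + args.length);
    }

    /** Rough analogue of "Id <- method(Args) returns Result"; void methods yield null. */
    public Object send(String id, String name, Object... args) {
        Object target = objects.get(id);
        try {
            for (Method m : target.getClass().getMethods())
                if (m.getName().equals(name) && m.getParameterCount() == args.length)
                    return m.invoke(target, args);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
        throw new IllegalArgumentException("no method " + name + "/" + args.length);
    }

    public static void main(String[] args) {
        ReflectionBridgeSketch bridge = new ReflectionBridgeSketch();
        bridge.javaObject(Counter.class.getName(), new Object[0], "myCounter");
        bridge.send("myCounter", "set", 5);
        System.out.println(bridge.send("myCounter", "get"));
    }
}
```

The main method replays the Counter example given earlier: the object is created from its class name, set is invoked with argument 5, and get returns the stored value.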

3.4 Interaction Support 

According to the path sketched in Section 2, the easiest way to deploy the
tuProlog engine is to exploit the classical client/server interaction pattern so as to
enable Prolog VMs to be used remotely via the Internet (TCP/IP). This requires
two components, one on the client side, and another one on the server side. In
tuProlog, the Prolog proxy (namely, the tuprolog.runtime.tcp.Proxy class)
is the client-side component: its purpose is to enable interaction with a remote
tuProlog server, providing the user with the same interface that is available when
using the Prolog VM as a Java object. The Prolog daemon component (namely,
the tuprolog.runtime.tcp.Daemon class), on the server side, consists of a running
process waiting for client requests on a specific IP port.



    module tuprolog {
      module runtime {
        module corba {
          module daemon {

            struct SolveInfoCORBA {
              boolean success;
              boolean hasOpenAlternatives;
              string solution;
              string substitution;
              boolean halt;
              long haltCode;
            };

            interface PrologCORBA {
              void clearTheory();
              string getTheory();
              void setTheory(in string theory);

              SolveInfoCORBA solve(in string g);
              SolveInfoCORBA solveNext();
              void solveHalt();
              void solveEnd();

              void loadLibrary(in string className);
              void unloadLibrary(in string className);

              void setSpy(in boolean on);
              boolean isSpy();
            };
          };
        };
      };
    };



Table 7. The tuProlog CORBA IDL (tuprolog.runtime.corba.Daemon class).

In order to take advantage of the available standard middleware
infrastructures, we developed the above interaction patterns not only on top of the classical
socket interface, but also in two more versions - namely, Java RMI and CORBA
(the tuprolog.runtime.rmi and tuprolog.runtime.corba packages,
respectively). In principle, the latter can be exploited to enable non-Java clients to
interact with tuProlog components (see Table 7).

Furthermore, in order to make it possible to exploit the power of tuple-based
coordination for complex Internet application design and deployment [2],
tuProlog is enriched with several packages and libraries. More precisely, to
enable different components/threads of a single Java application to interact via
a tuple space/centre, four more packages are supplied besides the tuprolog
one: tuplemedium, which provides the tuple space/centre Java interface (since
these two media look the same from the viewpoint of the interacting entities),
and the triple logictuple, logicts and respecttc, which provide an
implementation of logic tuple spaces and ReSpecT tuple centres, respectively [12]. In
particular, these coordination models are made available to tuProlog sources via
LogicTSLibrary and RespectTCLibrary.

Finally, TucsonLibrary and LuceLibrary, along with the tucson and luce
packages, enable tuProlog components to be integrated and exploited in the
design, development, and deployment of complex applications such as
Internet-based multi-agent systems, by exploiting the power of the TuCSoN [13] and LuCe
[5] coordination infrastructures.

3.5 Towards a Coordination Infrastructure 

One of the main aims of tuProlog was to provide a Prolog VM that could be 
easily exploited as a core component of both TuCSoN and LuCe coordination 
infrastructures. Even though an exhaustive discussion of TuCSoN and LuCe is 
clearly outside of the scope of this paper, in short both infrastructures are ba- 
sically meant to provide suitable coordination technologies for the design and 
development of complex Internet-based multi-agent systems [2]. In particular, 
despite their difference in the mapping of the Internet topology, both TuCSoN 
and LuCe exploit logic tuple centres for the coordination of Internet agents [12]. 
So, Internet agents interact with each other, as well as with Internet-based re- 
sources, by exchanging logic tuples, while logic tuple centres coordinate agent 
interaction according to their own behaviour specification. In both TuCSoN and 
LuCe, the behaviour specification of logic tuple centres encapsulates the laws of 
agent coordination, and is expressed via ReSpecT specification tuples - where 
ReSpecT itself is a logic-based language. 

A fundamental challenge was then to build the ReSpecT VM upon tuProlog. 
In fact, the tuProlog VM is currently used as the core technology for the ReSpecT 
VM, which is the basic component for both TuCSoN and LuCe infrastructures. 
For this purpose, first, ReSpecT logic tuples have been built as tuProlog terms. 
Then, the behaviour of ReSpecT built-ins (such as in_r, out_r, etc.) has been
defined in a separate module, and loaded by the tuProlog VM when the tuple 
centre boots. Finally, the behaviour of the ReSpecT VM has been captured in 
terms of a simple tuProlog interpreter. 

tuProlog has proved to be an effective core technology for Internet-based 
infrastructures like TuCSoN and LuCe. Besides, experience has suggested that, 
more generally, logic-based languages and systems could be exploited as powerful 
tools for flexible infrastructures meant to support the design, development, and 
deployment of complex applications like intelligent multi-agent systems. 

4 Case Studies 

In the following, two case studies are briefly introduced, which were developed to
test the effectiveness of tuProlog in the construction of Internet-based systems.
The first is the classic Dining Philosophers system [7], the second is a pair of
game applications, namely TicTacToe and Reversi.



    think.
    eat.

    mainLoop(F1,F2) :-
        think,
        in(ticket),
        in(chop(F1)),
        in(chop(F2)),
        eat,
        out(chop(F2)),
        out(chop(F1)),
        out(ticket),
        mainLoop(F1,F2).


    public class Philo implements Runnable {
        int f1, f2;

        public void run() {
            LogicTuple ticket = new LogicTuple("ticket");
            LogicTuple leftChop = new LogicTuple("chop", new Term(f1));
            LogicTuple rightChop = new LogicTuple("chop", new Term(f2));
            while (true) {
                think();
                tspace.in(ticket);
                tspace.in(leftChop);
                tspace.in(rightChop);
                eat();
                tspace.out(leftChop);
                tspace.out(rightChop);
                tspace.out(ticket);
            }
        }
    }



Table 8. Dining Philosopher agents in tuProlog (top) and Java (bottom).

In the Dining Philosophers example, a set of agents (philosophers) can either
think or eat: when a philosopher wants to eat, he has to compete with his
neighbours in order to acquire the resources (chopsticks) needed to eat. In particular,
in the case of n philosopher agents and n chopsticks, philosopher k needs
chopsticks k and (k mod n) + 1 to eat. Here, a Java startup program spawns five
philosophers: some are Prolog philosophers, whose behaviour is expressed as a
Prolog theory, the others are Java agents. Table 8 shows most of the code for
tuProlog and Java philosopher agents.
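The chopstick assignment and the ticket protocol of Table 8 can be tried out with a self-contained sketch in which semaphores stand in for the tuples. Several points are assumptions of this example rather than facts from the text: the class PhilosophersSketch, the bounded number of rounds, and in particular the choice of n - 1 tickets, which is the classic way to rule out deadlock but is not stated in the paper.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

/** Dining philosophers with tickets; semaphores stand in for tuples (illustrative only). */
class PhilosophersSketch {
    /** Chopsticks needed by philosopher k (1-based) among n: k and (k mod n) + 1. */
    static int[] chopsticksFor(int k, int n) {
        return new int[] {k, (k % n) + 1};
    }

    /** Run n philosophers for the given number of rounds; returns the total number of meals. */
    static int dine(int n, int rounds) {
        Semaphore[] chop = new Semaphore[n + 1];
        for (int i = 1; i <= n; i++) chop[i] = new Semaphore(1);
        Semaphore tickets = new Semaphore(n - 1);   // assumption: n - 1 tickets in the space
        AtomicInteger meals = new AtomicInteger();
        Thread[] threads = new Thread[n];
        for (int k = 1; k <= n; k++) {
            int[] c = chopsticksFor(k, n);
            threads[k - 1] = new Thread(() -> {
                for (int r = 0; r < rounds; r++) {
                    tickets.acquireUninterruptibly();      // in(ticket)
                    chop[c[0]].acquireUninterruptibly();   // in(chop(F1))
                    chop[c[1]].acquireUninterruptibly();   // in(chop(F2))
                    meals.incrementAndGet();               // eat
                    chop[c[1]].release();                  // out(chop(F2))
                    chop[c[0]].release();                  // out(chop(F1))
                    tickets.release();                     // out(ticket)
                }
            });
            threads[k - 1].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return meals.get();
    }

    public static void main(String[] args) {
        int[] c = chopsticksFor(5, 5);
        System.out.println(c[0] + " " + c[1]);   // philosopher 5 wraps around to chopstick 1
        System.out.println(dine(5, 100));
    }
}
```

With five philosophers and four tickets at most four agents compete for chopsticks at any time, so at least one of them can always pick up both and the run terminates with every philosopher having eaten in every round.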

Then, tuProlog has been used to build two game applications around a 
ReSpecT tuple centre: TicTacToe and Reversi. There, in short, human play- 
ers can challenge the computer player or fight against each other. In both 
games, pure Prolog agents work as both intelligent human opponents and ad- 
visers (expert agents) and also as game coordinators (master agent), whereas 
Java agents are mainly used for the GUI, as well as for global application coor- 
dination. The same classes of agents as well as the same coordination protocols 
are used for both applications: in particular, the same ReSpecT code is used to 
express the coordination laws for both TicTacToe and Reversi. For instance, Ta- 
ble 9 shows the tuProlog code of the master agent, as well as the ReSpecT code 
used to coordinate its interaction within the two applications. For an extensive
discussion of these examples, we refer the interested reader to [6].



    master :-
        initialise,
        main_loop.

    initialise :-
        % setting tuple centre behaviour
        set_spec(file('game.tcs')),
        % initializing tuple centre
        out(init([first(x),
                  opponent(x,o),
                  opponent(o,x),
                  init(g(_,_,_,
                  nextID(1)
                 ])).

    main_loop :-
        % asking for a new game and related Id
        in(new(ID)),
        % spawning a new Prolog Expert agent
        prolog(expert(ID),
               'expert.p', expert(ID,1),
               ['tuprolog.lib.RespectTCLibrary']),
        % loop again
        main_loop.


    % out(init([+...]))
    reaction(out(init(Tuples)), (
        in_r(init(Tuples)),
        out_r(list(Tuples))
    )).
    reaction(out_r(list([Tuple|Tuples])), (
        out_r(Tuple),
        out_r(list(Tuples))
    )).
    reaction(out_r(list(Tuples)), (
        in_r(list(Tuples))
    )).

    % in(new(-ID))
    reaction(in(new(_)), (post,
        current_tuple(new(ID)),
        rd_r(status(ID,playing(Role))),
        rd_r(opponent(Role,Opp)),
        out_r(free(ID,Role)),
        out_r(free(ID,Opp)),
        out_r(join(ID,Role))
    )).



Table 9. tuProlog master agent (top) and the corresponding ReSpecT fragment
(bottom).



5 Related Works and Conclusions

The best known amongst the plethora of Java-based implementations of Prolog
are probably MINERVA [11], Jinni [9], jProlog [10], CKI Prolog [3], and W-Prolog
[20] (for an updated list, see [16]). With respect to such systems, tuProlog is more
focused on the openness of the overall architecture, as well as on the minimality
and the extendibility of the Prolog core. Other key issues in tuProlog are the
deployment of the core and its integration with the other relevant Java standard
technologies: accordingly, the tuProlog VM has been designed as a Java bean, so
as to make it easily exploitable in a standard way by modern visual tools. For
the same reason, tuProlog is provided with both an RMI and a CORBA interface,
which enable the Prolog VM to be exploited by means of the most common
standard middleware technologies for distributed application development.

From the interaction viewpoint, the main purpose of tuProlog is to provide 
several different ways of interacting in the context of a clean and well-defined 
framework, which could be easily adapted to any specific architecture. This
implies that tuProlog has to provide a reliable technology, complying with the current
Internet standards, as well as to support a suitably expressive range of models of
coordination, allowing interactions among heterogeneous (Prolog and Java)
entities (components, agents, applications) to be enacted and effectively managed.

BinProlog [1] and SICStus Prolog [15], for instance, provide a Linda-like
coordination model, but focus on interaction among homogeneous (Prolog only)
components. Moreover, minimality does not seem to be a concern in either system.
Jinni [9] is a pure logic programming language used as a scripting tool for gluing
knowledge processing components and Java objects together: the language itself
can be used to easily express mobile agent behaviour, weaving computation
and coordination in the same linguistic tool. The tuProlog approach is quite
different, in that we desired a clean separation between the two orthogonal aspects
of computation and interaction: thus, tuProlog is meant for computation only
(Prolog processes, components, agents, ...), while coordination is delegated to
Linda-like coordination primitives, as well as to the ReSpecT language.

Code and documentation are freely available at the tuProlog site [17]. 
Abstract. Logic programming is based on the idea that computation
is controlled inference. The Extended Andorra Model provides a very 
powerful framework that supports both co-routining and parallelism. We 
present the BEAM, a design that builds upon David H. D. Warren’s orig- 
inal EAM with Implicit Control. The BEAM supports Warren’s original 
EAM rewrite rules plus eager splitting and sequential conjunctions. We 
discuss the main issues in the implementation of the BEAM and show 
that the EAM with Implicit Control can perform quite well when com- 
pared with other implementations that use the Andorra principle. 



Keywords: Logic Programming, Execution Mechanisms, Language Implementation.



1 Introduction 

Logic programming is based on the idea that computation is controlled infer- 
ence. Prolog is the most popular example of a logic programming language, and 
has been successfully used in applications such as artificial intelligence, database 
programming, circuit design, genetic sequencing, expert systems, compilers, sim- 
ulation, and natural language processing. Prolog relies on SLD resolution and 
uses a straightforward left-to-right selection function and depth-first search rule. 
This computation rule is simple to understand and efficient to implement but, 
unfortunately, it is not the ideal rule for every logic program. In some cases 
the limitations of Prolog lead programmers to convoluted and non-declarative 
programs. Correct logic programs may run slowly, or loop. 

Research on improving the performance of logic programs has focussed on 
three main alternatives: co-routining, tabling, and parallelism. Essentially, co- 
routining allows goals to execute only when sufficient data is available. The idea 
has been discussed since the initial days of logic programming [5, 4, 19] and most 
modern Prolog systems support some form of co-routining. Work on tabling is 
more recent: the idea is to reuse intermediate solutions to a query. The XSB- 
Prolog system [20] has shown that tabling can be quite useful in opening up novel 
applications for logic programming. Lastly, several different forms of parallelism
can be exploited in logic programs with good results [16, 1, 10, 6, 21, 17]. 

Ideally, we would like novel computational models for Prolog to achieve the
following goals, presented by David H. D. Warren [25] in order of priority:

— Minimum number of inferences: this is achieved by trying never to repeat the
same execution step of an inference in different locations of the execution
tree.

— Maximum parallelism: this is achieved by allowing goals to be executed as
independently as possible, and by combining solutions as late as feasible.

The Extended Andorra Model [25], or EAM, provides a powerful framework
for the exploitation of both co-routining and parallelism in logic programs. The 
key ideas for this model can be described as: 

— Goals can execute (in parallel) as long as they are deterministic or they do 
not need to bind external variables; 

— If a goal must bind external variables non-deterministically, the computation 
of this goal will split. 

Warren’s original goal for the EAM was to improve the efficiency of controlled
inference. We followed Warren’s approach in developing the BEAM, a novel
system that builds upon the original EAM work by:

— Providing a complete description of an EAM kernel as a set of rewrite and
control rules, and evaluating these rules through a prototype implementation.
We call this kernel design the BEAM, Basic design for Extended Andorra Model.

— Studying how to take the best advantage of the EAM with the least programmer
intervention. In the spirit of Kowalski’s original definition, and building
upon Warren and Gupta’s original work [9], we implemented different approaches
to exploiting control and contrasted them with the guard-style approach
used in AKL.

— Researching novel implementation techniques for the EAM, including
efficient implementations of splitting.

Our first results are very promising. The system achieves acceptable base per- 
formance, similar to other EAM applications. In this work we show a group of 
applications where the system benefits from the advanced search inherent to the 
EAM. Moreover, we show that implicit control can be in fact quite effective for 
a sizeable number of applications, and that simple annotations can contribute 
to further improvements with little programmer effort. 

Our work contrasts with Haridi and Janson’s Agents Kernel Language [11], a
powerful concurrent language based on the EAM. AKL was the first system
to show that the key ideas in the EAM can be implemented with good results.
Programming in AKL is thread-based: threads communicate through variables
and perform encapsulated search. This was a novel paradigm, and, arguably,
AKL programming requires considerable effort in order to take best advantage
of the concurrency inherent to the language. In contrast, we see the EAM as a
contribution to logic programming. We would like programmers to be able to
experiment with pure Horn clause programs, using minimal annotations to improve
control when required. Our first results do show that our base performance
is similar to AKL’s, thus suggesting that our approach is quite viable.

The EAM has also influenced Pontelli and Gupta’s work on Extended Dynamic
Dependent And-Parallelism [8]. Their work shows how the EAM ideas are
important in parallel logic programming systems, although in their case they do
require parallelism in order to achieve some of the features of the EAM.

The paper presents an overview of the main implementation issues in the
BEAM Andorra Model and discusses how control is implemented. Next, we discuss
the main implementation techniques used in the BEAM. We then focus on
the implementation of splitting. Finally, we present initial performance
results, and suggest some conclusions and future work.

2 The Extended Andorra Model 

The EAM can be presented as a set of computing rules that perform refutation of 
goals over Horn Clause programs. Compared to the traditional Prolog strategy, 
the EAM uses two strategies to reduce search: 

1. The EAM follows the Andorra Principle and selects deterministic goals 
first [24]. A first advantage is that deterministic goals need to be tried only 
once, rather than re-executed at different branches of the search space. The 
second advantage is that bindings from the deterministic goals may reduce 
the number of alternatives for other goals, and even make them deterministic 
as well. 

2. The computation is represented as nested conjunctions and disjunctions, that
is, as an and-or tree, not as a conjunction of clauses. This allows reducing
the scope of non-deterministic operations.

Next, we describe the basic ideas in the EAM, and how they are implemented 
in the BEAM. Our definitions are based on Warren’s original definitions [25]. 

2.1 Execution Principles for the EAM 

The EAM is formally defined by rewrite rules that manipulate and-or trees. Each 
node of the tree may be either an and-box or an or-box.

— And-box: $[\exists X_1, \ldots, X_m : \sigma \& G_1 \& \ldots \& G_n]$

An and-box corresponds to a clause with sub-goals $G_1$ to $G_n$. The variables
$X_1$ to $X_m$ represent the variables created in the box and $\sigma$ represents
the bindings on external variables imposed by the and-box. We say that this
and-box is the home for variables $X_1, \ldots, X_m$. Throughout this paper we
will assume the Herbrand domain, but the EAM naturally supports other
constraint systems, such as finite domains [18, 3].

— Or-box: $\{C_1 \vee \ldots \vee C_n\}$

An or-box corresponds to an unfolded goal. Each $C_i$ represents an alternative
clause for the goal.

Unlike the original EAM and AKL, BEAM does not support choice-boxes. The 
BEAM rewrite rules support pruning operators using and-boxes and or-boxes.

We say that a computation is compact when all children of and-boxes are 
or-boxes, and when all children of or-boxes are and-boxes. Moreover, a variable 
is said to be external to an and-box when not defined in the current and-box, 
and local otherwise. 
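As an informal illustration of these definitions, the following Python sketch models and-boxes and or-boxes and checks whether a tree is compact. It is not taken from the BEAM sources: the names AndBox, OrBox, is_external and is_compact are hypothetical, and we assume that goals which have not yet been reduced are represented as plain strings and do not affect compactness.

```python
from dataclasses import dataclass, field
from typing import List, Set, Union


@dataclass
class OrBox:
    """An unfolded goal: one and-box per alternative clause."""
    alternatives: List["AndBox"] = field(default_factory=list)


@dataclass
class AndBox:
    """A clause body: the variables it is home to, plus its children.

    A child is an unreduced goal (a string) or an OrBox. A nested AndBox
    is representable, but makes the tree non-compact.
    """
    local_vars: Set[str] = field(default_factory=set)
    children: List[Union[str, OrBox, "AndBox"]] = field(default_factory=list)


def is_external(var: str, box: AndBox) -> bool:
    """A variable is external to a box when the box is not its home."""
    return var not in box.local_vars


def is_compact(node) -> bool:
    """True when and-boxes and or-boxes strictly alternate below `node`."""
    if isinstance(node, str):  # unreduced goal: nothing to check (assumption)
        return True
    if isinstance(node, AndBox):
        return all(not isinstance(child, AndBox) and is_compact(child)
                   for child in node.children)
    # node is an OrBox: every alternative must be an and-box
    return all(isinstance(alt, AndBox) and is_compact(alt)
               for alt in node.alternatives)
```

With this model, a variable created in a parent and-box is local there and external in every and-box nested below it.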

There are four basic operations in BEAM: 

1. Reduction: resolves a goal $G$ against the heads of all clauses defining the
procedure for $G$.

$G \rightarrow \{[\exists Y_1 : \sigma_1 \& C_1] \vee \ldots \vee [\exists Y_n : \sigma_n \& C_n]\}$

Similar to the original EAM reduction rule, given a goal $G$, reduction replaces
$G$ by a new or-box with several alternatives to be explored.

2. Promotion: promotes an and-box that is the single alternative of an or-box
to the parent and-box. This promotion rule is similar to that of the original
EAM, but here it is applied in two different steps:

(a) $[\exists X : \sigma \& A \& \{[\exists Y : \theta \& G]\} \& B] \rightarrow [\exists X, Y : \sigma\theta \& A \& \{[G]\} \& B]$

(b) $[\exists X, Y : \sigma\theta \& A \& \{[G]\} \& B] \rightarrow [\exists X, Y : \sigma\theta \& A \& G \& B]$

The first step propagates bindings from an and-box to the one above. The
second step compacts the and-or tree, but can only be applied if the box to
be promoted does not include pending pruning operators.

3. Propagation: propagates bindings downwards in the and-or tree:

$[\exists X, W, Z : Z = \theta(W) \& \ldots \& \{\ldots \vee [\exists Y : G] \vee \ldots\} \& \ldots] \rightarrow$

$[\exists X, W, Z : Z = \theta(W) \& \ldots \& \{\ldots \vee [\exists Y : Z = \theta(W) \& G] \vee \ldots\} \& \ldots]$

The promotion rule propagates locally generated bindings towards the node
where the variables were created. In contrast, the propagation rule allows us
to propagate the bindings in the opposite direction, from a variable’s home
and-box to where it is consumed.

4. Splitting (non-determinate promotion): distributes a conjunction across a
disjunction, similar to the forking rule of the original EAM:

$[\exists X : \sigma \& A \& \{C_1 \vee \ldots \vee C_i \vee \ldots \vee C_n\} \& B] \rightarrow$

$\{[\exists X : \sigma \& A \& \{C_i\} \& B] \vee [\exists X : \sigma \& A \& \{C_1 \vee \ldots \vee C_{i-1} \vee C_{i+1} \vee \ldots \vee C_n\} \& B]\}$

We say that this operation is non-deterministic because it allows more computation
by extending the scope of disjunctions. We have changed Warren’s
original splitting rule in order to support pruning: we do not merge the split
and-box (represented as $C_i$) with the parent and-box. The merge will be
performed later by an application of the promotion rule to the and-box $C_i$. We
thus implement pruning without the overhead of an extra data-structure.
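The promotion and splitting rules can be sketched on a toy representation. The Python code below is only an illustration under simplifying assumptions, not the BEAM implementation: the class Box and the functions promote and split are hypothetical names, bindings are plain dictionaries (so composing the two substitutions is just a dictionary update with no conflict checking), and sibling goals are shared between the two copies rather than replicated.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Box:
    """And-box: local variables, bindings on external variables, and a body.

    Body items are goals (strings) or or-boxes (Python lists of Box).
    """
    variables: List[str] = field(default_factory=list)
    bindings: Dict[str, str] = field(default_factory=dict)
    body: List[object] = field(default_factory=list)
    pending_prune: bool = False


def promote(parent: Box, i: int) -> None:
    """Promotion: the or-box at parent.body[i] must hold one alternative."""
    (child,) = parent.body[i]  # raises ValueError unless exactly one
    # Step (a): variables and bindings move up to the parent and-box.
    parent.variables.extend(child.variables)
    parent.bindings.update(child.bindings)
    child.variables, child.bindings = [], {}
    # Step (b): compact the tree unless a pruning operator is pending.
    if not child.pending_prune:
        parent.body[i:i + 1] = child.body


def split(parent: Box, i: int, k: int) -> List[Box]:
    """Splitting: distribute parent's conjunction over the or-box at body[i].

    Alternative k gets its own copy of the parent, still wrapped in a
    singleton or-box (it is not merged); the other copy keeps the rest.
    """
    or_box = parent.body[i]
    rest = or_box[:k] + or_box[k + 1:]

    def clone(alternatives: List[Box]) -> Box:
        body = parent.body[:i] + [alternatives] + parent.body[i + 1:]
        return Box(list(parent.variables), dict(parent.bindings), body)

    return [clone([or_box[k]]), clone(rest)]
```

Because split leaves the chosen alternative inside a singleton or-box, a later call to promote performs the merge, mirroring the two-phase treatment described in the text.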
The BEAM also supports several simplification and optimisation rules. The rules
include support for success and failure propagation, and for tail-recursive
execution of deterministic procedures.



2.2 Control for the BEAM 

The previous rules provided the basic machinery for the correct execution of logic 
programs. To this machinery we must add control strategies. We use a default 
strategy with implicit control according to David Warren’s original design in the 
BEAM: 

— Reduction only applies to goals G such that: 

1. the new or-box has a single branch (deterministic computation), or, 

2. the and-box for G does not hold bindings on external variables. 

In other words, we always reduce deterministic goals or goals in boxes that 
do not bind external variables. 

— Promotion and propagation are always allowed.

— Splitting is allowed if no other rules apply. 
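The default strategy can be read as a small decision procedure. The Python sketch below is one possible reading, with hypothetical function names. The text only fixes that splitting comes last, so the relative order chosen here for promotion, propagation and reduction is an assumption.

```python
from typing import Optional


def may_reduce(alternatives: int, box_binds_external: bool) -> bool:
    """Reduction is allowed for deterministic goals, or for goals whose
    and-box holds no bindings on external variables."""
    return alternatives == 1 or not box_binds_external


def select_rule(can_promote: bool, can_propagate: bool,
                can_reduce: bool, can_split: bool) -> Optional[str]:
    """Pick the next rewrite rule; splitting is only a last resort."""
    if can_promote:
        return "promotion"
    if can_propagate:
        return "propagation"
    if can_reduce:
        return "reduction"
    if can_split:
        return "splitting"
    return None  # no rule applies
```

For instance, a goal with three matching clauses in a box that binds external variables is not reduced, and it becomes a candidate for splitting only once nothing else can run.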

Deciding when to apply the splitting rule is an important issue for EAM 
implementations. We have considered three major extensions to the default rule: 

— Declare predicates as producers and allow them to perform Eager-Forking,
that is, to do splitting as soon as they are called. The idea was first proposed
by Gupta and Warren [9]. Intuitively, we declare goals to be producers if we
expect them to produce bindings for other goals, but not to consume bindings
from them.
Eager-forking is dangerous, because producers may in fact benefit from bindings
generated by other goals, but is also attractive, because it is simple to
understand, can increase parallelism, and in a few cases actually improves
the search space. We have implemented eager-forking on the BEAM and we
discuss some results in section 5.

— Scope non-deterministic promotions: the user specifies a limit up to which
we should check for splitting. The idea has appeared in many guises: mini-scopes
in Gupta and Warren [9], independent computations in Bueno and
Hermenegildo [2]. Scoping is also quite important for a parallel implementation.

— The right-hand side of a sequential conjunction can only be evaluated after
the left-hand side has succeeded. We only use the sequential conjunction at
the top-level, in order to guarantee correct ordering for side-effect builtins
such as read/1 or write/1 [22].

Our approach contrasts with AKL where language constructs are used to se- 
quence execution of clauses [11]. 
3 Implementing the BEAM

Our EAM implementation relies on two main components, shown as ovals in
figure 1:

And-Or-Tree Manager: This module applies the EAM rewriting rules to the 
existing and-boxes and or-boxes until it can start WAM-like execution for 
the selected goal. 

Emulator: This module executes WAM-like code to perform the reduction step. 
Unification code is similar to the WAM. For efficiency reasons the module 
also supports reduction followed by and-compression, that is, deterministic 
execution. Control instructions for deterministic execution follow a compila- 
tion scheme similar to the WAM. 




Fig. 1. Execution Model. 



The design and implementation of the emulator and of the manager are
discussed in more detail in [13]. The system relies on current Prolog compilation
technology and on the techniques developed for concurrent logic programming
languages. The CodeSpace stores the database in much the same way as for
Prolog. The Global Memory stores the and-or-tree and further subdivides into
the Heap and the Box-Memory. The Box-Memory stores boxes and variables.
The Heap holds Prolog terms such as lists and structures. The Heap uses term
copying to store compound terms and is thus very similar to the WAM’s Heap,
with the difference that in the BEAM, Heap memory cannot be recovered after
backtracking. A Garbage Collector is thus necessary to recover space in this
area (see [14]).

3.1 The Abstract Machine 

Registers. The internal state of the BEAM abstract machine is characterized by
the following registers: PC - Program Counter; H - Top of Heap; S - Structure
pointer; Mode - controls whether unification is in read or write mode; X1, X2, ...
- registers for Temporary Variables, also used as argument registers. The new
registers in the BEAM are OBX, which points to the current or-box; ABX, which
points to the current and-box; and SU, which points to the start of the
suspensions list.



Or- and And-boxes. Or-boxes are used to index the alternative clauses for a
goal. And-boxes correspond to a clause and are kept connected to the parent or- 
box. Furthermore, and-boxes include, among other fields, a set of flags to store 
relevant information to guide the computation, a list of local variables, a list 
of locally bound external variables, and a list of suspensions on those external 
variables. 
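To make the fields listed above concrete, the following Python sketch lays out one possible record structure for boxes together with the three BEAM-specific registers. The field and function names are hypothetical (only OBX, ABX and SU come from the text), and the real system stores these records in the Box-Memory rather than as Python objects.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(eq=False)
class OrBoxRecord:
    parent: Optional["AndBoxRecord"] = None
    alternatives: List["AndBoxRecord"] = field(default_factory=list)


@dataclass(eq=False)
class AndBoxRecord:
    parent: Optional[OrBoxRecord] = None          # link to the parent or-box
    flags: Dict[str, bool] = field(default_factory=dict)
    local_vars: List[str] = field(default_factory=list)
    bound_externals: Dict[str, object] = field(default_factory=dict)
    suspensions: List[str] = field(default_factory=list)


@dataclass
class Registers:
    OBX: Optional[OrBoxRecord] = None             # current or-box
    ABX: Optional[AndBoxRecord] = None            # current and-box
    SU: List[AndBoxRecord] = field(default_factory=list)  # suspensions list


def enter_alternative(regs: Registers, or_box: OrBoxRecord) -> AndBoxRecord:
    """Create an and-box for a new alternative and make it current."""
    box = AndBoxRecord(parent=or_box)
    or_box.alternatives.append(box)
    regs.OBX, regs.ABX = or_box, box
    return box
```

Records are compared by identity (eq=False), since two boxes with equal contents are still distinct nodes of the tree.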



Abstract Machine Instructions. Intermediate code for the BEAM abstract machine
follows the WAM style closely. The BEAM instructions use the WAM
get, put and unify instructions. However, permanent variable slots, or Y-slots,
correspond to the and-box local variables. BEAM implements the following novel
control instructions:

explore_alternative i : explore the ith alternative within the current or-box. 
prepare_calls n : creates an and-box with n subgoals. For each subgoal we 
record the entry point of the start code for the corresponding call, and ini- 
tialize it as READY to mean that it is ready to be explored. Execution is then 
passed to the And-Or-Tree Manager, through the next_call port, that will 
decide which call to execute. 

call pred : creates one or-box with n branches, where n is the number of
alternatives for pred. For each branch we record the entry point for the starting
code of the corresponding alternative, and initialize it as READY to mean that
it is ready to be explored. Execution is then passed to the And-Or-Tree Manager,
through the next_alternative port, that will decide which alternative
to execute.

proceed : this instruction returns control from a clause to the And-Or-Tree
Manager. If the and-box does not have any external variables, then the and-box
has succeeded and the execution proceeds to the success entry port of the
And-Or-Tree Manager. Otherwise, the and-box is marked as suspended, and
execution enters the suspend port in the And-Or-Tree Manager.
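The bookkeeping done by these control instructions can be sketched as follows. This Python fragment is a simplified illustration, not BEAM code: boxes are plain dictionaries, entry points are made-up integers, and code_space is a hypothetical mapping from a predicate to the entry points of its clauses.

```python
READY = "READY"


def prepare_calls(entry_points):
    """Create an and-box record with one READY slot per subgoal."""
    return {"kind": "and",
            "calls": [{"entry": e, "state": READY} for e in entry_points]}


def call(pred, code_space):
    """Create an or-box record with one READY branch per clause of `pred`."""
    return {"kind": "or",
            "branches": [{"entry": e, "state": READY}
                         for e in code_space[pred]]}


def proceed(has_external_variables):
    """Return the And-Or-Tree Manager port entered when a clause ends."""
    return "suspend" if has_external_variables else "success"
```

In both prepare_calls and call, nothing is executed eagerly: the records only note where each call or alternative starts, and the manager later picks which READY entry to run.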

Compiling Prolog clauses to BEAM abstract machine instructions is very
similar to WAM compilation. The main difference is that, unlike in the WAM,
code for rules in the BEAM does not end with a proceed instruction. In its
current version, the BEAM abstract machine is goal based. The prepare_calls
instruction creates an and-box with as many branches as calls, and initializes
each branch to point to the start code of each call. It is up to the And-Or-Tree
Manager to decide how and when to execute the calls.
3.2 The And-Or-Tree Manager

The And-Or-Tree Manager is the heart of our system. Its task is to decide which
rewrite rule should be applied to the current tree, and then execute it. The main
rewrite rules are the Reduction rule, the Promotion rule, and the Splitting rule.

The And-Or-Tree Manager can be divided into eight different modules.

success : this module marks the current or-box as successful in its parent. The 
memory of the or-box is released. The parent and-box is checked and, if all 
calls have already reached success, the module is reentered for the upper 
or-box (Success Propagation). Otherwise, execution enters the next_call 
module. 

fail : this routine marks the current and-box as failed in its parent. All the
assignments made by the and-box are removed, and space for the and-box
is reclaimed. If all the alternatives for the parent or-box have failed, the
module is recursively called for the parent and-box (Failure Propagation).
Otherwise, if there is only one more alternative, execution moves to
unique_alternative. If there are several alternatives, execution continues
to next_alternative.

suspend : this routine adds the and-box to the current suspension list. Next,
the routine clears all assignments saved in the list of the external variables.
Each external variable is also added to the suspension list included in the
respective local variable. After that, the routine jumps to next_alternative.

next_call : this module searches for the next non-suspended call in the current
and-box. If no call is ready in the current and-box, execution moves to
next_alternative. Otherwise, the PC is set to point to the call’s code, and
execution jumps to the abstract machine emulator.

next_alternative : this routine searches for the next non-suspended alternative
in the current or-box. If there is no such alternative, execution jumps to
next_call. Otherwise, if the alternative is in the WAKE state, execution moves
to wake, else execution sets the PC to the code for the alternative, and
enters the alternative.

unique_alternative : this module promotes the current and-box, since its par- 
ent or-box has a single alternative. All external variables are checked, since 
after the promotion some external variables may have become local. If during 
the promotion of external variables some unifications fail, execution moves to 
fail. If external variables still exist after the promotion, the and-box contin- 
ues suspended, and execution moves to next_call. Otherwise, if the and-box 
has suspended on end, execution can move to success; if goals are still left 
for running, the manager marks the and-box as running and continues its 
execution. 

wake : this module wakes a suspended and-box. All external variables are checked
to look for changes. If a unification fails, execution jumps to fail. If external
variables still exist, execution continues to select_work. If no more
external variables are left, and the and-box has suspended on end, it moves
immediately to success; otherwise, it marks the and-box as running and
continues its execution.

select_work : this module looks for work in the suspension list. If there is no
more work available in the suspension list, the execution can end. Otherwise,
the ABX register is set to point to the next suspended box. If the current
and-box is the single alternative in its parent or-box, execution moves to
unique_alternative. If there exist several alternatives, a Splitting is performed
in the and-box. After the fork, one of the resulting and-boxes is woken,
and its execution is restarted in wake.
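The transitions between manager modules amount to a small state machine. As a partial illustration, the Python sketch below encodes only the exits of the fail and wake modules as described above. The function names and the "run" label for resuming execution are hypothetical.

```python
def after_fail(remaining_alternatives: int) -> str:
    """Next module after an and-box fails, given how many alternatives
    of the parent or-box have not failed."""
    if remaining_alternatives == 0:
        return "fail"  # failure propagation to the parent and-box
    if remaining_alternatives == 1:
        return "unique_alternative"
    return "next_alternative"


def after_wake(unification_failed: bool, externals_left: bool,
               suspended_on_end: bool) -> str:
    """Next module after waking a suspended and-box."""
    if unification_failed:
        return "fail"
    if externals_left:
        return "select_work"
    if suspended_on_end:
        return "success"
    return "run"  # mark the and-box as running and continue its execution
```

The remaining modules could be tabulated in the same way, each returning the name of the module that execution enters next.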



4 The Implementation of Splitting 

The split operation is executed whenever no deterministic operation is applica- 
ble. In other words, the splitting rule is only allowed when all leaf boxes in the 
tree are suspended, probably due to some external variable assignment. 

Splitting is the most complex rewrite operation on BEAM, and our imple- 
mentation of the splitting rule is better explained through a small example: 

start :- a(X), b(...).
a(first).
a(second).





Fig. 2. Example of Splitting: (a) state before splitting; (b) state after splitting.



Figure 2a presents the BEAM execution state for this example. We assume 
that the computation has suspended for both calls a(X) and b(...). Note that
the and-boxes (5) and (6) have tried to assign first and second, respectively, 
to X (local variable structure (3)), forcing both and-boxes to suspend. Both and- 
boxes are therefore in the suspension list of the local variable X. 

Figure 2b presents the BEAM execution state after executing the splitting. 
The algorithm starts by creating a new or-box (8). This or-box is initialized as 
having a single alternative. The leftmost suspended and-box (5) is then set as the 
unique alternative in this recently created or-box. The branch of this and-box 
on the older or-box (4) is removed. 
Replicating the and-box a(X),b(X) is the key step in the algorithm: we
need to replicate the whole and-or tree rooted at and-box (2), and to do so we
must traverse that whole tree. This is performed through the
replicate_andbox routine. The copy (9;10) is identical to the original except for
the split or-box, which was already created in the previous step. Copying a
sub-tree is a question of recursively copying or-boxes and and-boxes. The complex
part arises when copying variables and terms. Our algorithm proceeds as follows:

1. For all new and-boxes (9), a vector with new local variables is created (11). 
During this routine, old local variables are temporarily set to point to the 
newly created local variables so that references to the local variables in terms 
always refer to the new local variables. 

2. All external variable structures available on the and-box (9) and children 
must be set to point to the new local variables, created in the previous step. 
For example, the external variable structure in (5) is now set to point to the 
newly created local variable structure (11). The suspensions list on the local 
variables must also be reset. 

3. We must refresh the suspensions lists on the local variables. For example,
the suspensions list for the local variable structure (3) of figure 2(a) had
two elements, the and-boxes (6) and (5). After the fork, the variable structure
(3) is set to point only to the and-box (6). The and-box (5) is now included
in the suspension list of the newly created local variable (11).

4. Or-boxes are replicated through the replicate_orbox routine. This routine 
essentially creates a new or-box and then for each alternative replicates the 
children and-boxes. 
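The four steps above can be sketched as a pair of mutually recursive routines. The Python code below is a simplified illustration: the routine names follow the text, while the classes LocalVar, AndNode and OrNode and the forward map are hypothetical. We assume that every and-box holding an external reference is suspended on that variable, and the special handling of the moved and-box (5) is left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(eq=False)
class LocalVar:
    name: str
    suspensions: List["AndNode"] = field(default_factory=list)


@dataclass(eq=False)
class AndNode:
    local_vars: List[LocalVar] = field(default_factory=list)
    external_refs: List[LocalVar] = field(default_factory=list)
    children: List["OrNode"] = field(default_factory=list)


@dataclass(eq=False)
class OrNode:
    alternatives: List[AndNode] = field(default_factory=list)


def replicate_andbox(box: AndNode,
                     forward: Dict[LocalVar, LocalVar]) -> AndNode:
    """Copy `box` and everything below it, renaming its local variables."""
    # Step 1: fresh local variables; old ones forward to the new ones.
    new_locals = [LocalVar(v.name) for v in box.local_vars]
    forward.update(zip(box.local_vars, new_locals))
    copy = AndNode(local_vars=new_locals)
    # Step 2: external references into the copied region are redirected;
    # references to variables outside it are kept as they are.
    copy.external_refs = [forward.get(v, v) for v in box.external_refs]
    # Step 3: the copy suspends on the variables it now refers to.
    for var in copy.external_refs:
        var.suspensions.append(copy)
    # Step 4: or-boxes below are replicated recursively.
    copy.children = [replicate_orbox(c, forward) for c in box.children]
    return copy


def replicate_orbox(box: OrNode,
                    forward: Dict[LocalVar, LocalVar]) -> OrNode:
    """Create a new or-box and replicate each child and-box."""
    return OrNode([replicate_andbox(alt, forward)
                   for alt in box.alternatives])
```

Passing the same forward map down the recursion is what lets a deeply nested and-box find the new copy of a variable whose home is several levels above it.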

The recently created subtree (9) is then added to the top or-box (1) as a new 
branch representing a new alternative, to the left of the older and-box (2). 

After applying the splitting rule, some boxes are marked as woken (5;6) as
they may be ready for promotion. In the example, both and-boxes (5) and (6)
will be promoted and will have their parent local variable X assigned to first
and second respectively. The computation of b(...) on (10) and (7) will then
be ready to continue.

Our implementation of splitting is in fact quite similar to AKL’s. There are 
two major differences: 

— we do not copy all terms as AKL does. BEAM only copies the terms that
have unbound variables. Structures that are fully instantiated are shared
between the old subtree and the new one.

— we do not copy the child and-box where the splitting is being performed.
Instead, the and-box is moved directly to a new branch in the new subtree.
However, all variables are checked so that references to the old subtree are
changed to the new subtree. In the example the and-box (5) is not replicated
during the splitting but it must reset its external reference to the new variable
X (11).
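The first difference, sharing fully instantiated structures, can be illustrated with a small term-copying routine. This Python sketch is not the BEAM code: terms are modelled as tuples of the form (functor, arg1, ...), Var is a hypothetical class for unbound variables, and forward is assumed to map each old local variable of the replicated subtree to its new counterpart.

```python
class Var:
    """An unbound logic variable; identity matters, the name is for display."""

    def __init__(self, name):
        self.name = name


def is_ground(term):
    """True when `term` contains no unbound variable."""
    if isinstance(term, Var):
        return False
    if isinstance(term, tuple):
        return all(is_ground(arg) for arg in term[1:])
    return True  # atoms and numbers


def copy_term(term, forward):
    """Copy `term`, redirecting variables through `forward`.

    Ground sub-terms are returned as-is, so they stay shared between the
    old subtree and the new one. Variables that are not in `forward`
    belong outside the replicated subtree and are left untouched.
    """
    if isinstance(term, Var):
        return forward.get(term, term)
    if isinstance(term, tuple) and not is_ground(term):
        return (term[0],) + tuple(copy_term(arg, forward)
                                  for arg in term[1:])
    return term
```

For clarity, is_ground re-scans sub-terms on every call; an implementation concerned with cost would cache or tag groundness instead.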
5 Evaluating BEAM

This section presents the performance results of the current BEAM system and
compares them with the fastest version of SICStus Prolog and with YAP [7], an
emulated Prolog system. The BEAM was implemented on top of YAP98 (YAP
4.3 is a more recent version).

The timings were measured running the benchmarks on a Pentium III
450MHz with 256MB RAM running Mandrake Linux 7.1. The BEAM was configured
with 64MB for the Heap plus 32MB for the Box Memory. The benchmarks are
available from the first author’s home page at http://www.ncc.up.pt/~rslopes.
For each benchmark we present the number of non-deterministic promotions and
the number of invocations of the garbage collector for BEAM. The runtime is
presented for all systems in milliseconds. The timings were the best from ten
runs.
Table 1. Deterministic Benchmarks (Time in milliseconds)



Table 1 shows how the BEAM performs versus Andorra-I, AKL, and sev- 
eral Prolog systems for deterministic applications. Neither the BEAM nor AKL 
perform splitting, and Andorra-I executes determinately. Prolog systems may 
create choice-points. The merge application uses cut in Prolog and commit in 
the BEAM, AKL and Andorra-I. 

The YAP and SICStus Prolog systems are recognised as the fastest Prolog 
systems on the x86 architectures. The difference between Yap98 (on which the 
BEAM is based) and Yap4.3 shows that there is scope for improvement even for 
Prolog systems. These improvements should also benefit the BEAM. 

Yap4.3 is between 2 and 5 times faster than the BEAM. SICStus Prolog and
Yap98 are somewhat slower than Yap4.3. This is quite a good
result for the BEAM, considering the extra complexity of the Extended Andorra
Model. The ratios between the BEAM and Andorra-I seem to depend on the
determinacy code. The more sophisticated determinacy code in Andorra-I can
limit overheads that the BEAM goes through in creating unnecessary and-boxes.
The BEAM tends to perform better than the AKL on tail-recursive computations.
We believe this is because the BEAM has special rules for performing
tail-recursive computation that avoid creating intermediate or-boxes and and-boxes.
The AKL is faster on the merge benchmark that uses the commit operator, and
the BEAM is slightly faster on the kkqueens benchmark. In conclusion, the BEAM
seems to be somewhat leaner than the AKL, but it still has greater overheads
than Andorra-I. Determinacy optimisation, a better implementation of commit,
and integrating with Yap4.3 should further improve BEAM performance.

Comparing these systems for non-deterministic benchmarks is hard, because
the search spaces may be quite different for Prolog, BEAM, AKL, and Andorra-I. 
We will consider two classes of non-deterministic applications. First, we consider 
applications where the Andorra Model does not provide huge improvements to 
the search space. Note that in general one would not be terribly interested in 
the BEAM for these applications: first, splitting is very expensive and second, 
or-parallelism can also be quite effectively exploited in Prolog. We will next 
consider examples where the Andorra rule reduces, very significantly, the search 
space. 

Table 2 shows a set of five non-deterministic benchmarks. We consider two
versions of the BEAM. As explained before, with eager splitting, ES, splitting 
on producer goals is performed immediately. This makes our computation rule
closer to that of Prolog. By default, splitting is delayed until no other rules are 
applied. 



Benchs.    BEAM             BEAM        And-I.   AKL              YAP     YAP     SICStus
           (default)        (ES)                                  98      4.3     3.8.2

ancestor   NA               0.60 (30)   0.53     1.61 (73)        0.08    0.06    0.14
houses     18.7 (49)        8.0 (68)    4.2      26.0 (236)       1.3     1.0     2.1
query      13.41 (624)      9.0 (624)   4.71     87 (624)         2.15    0.86    2.14
zebra      700 (1,743)      50 (294)    123      108 (493)        32      26      44
puzzle4x4  4,490 (53,350)   -           3,680    4,990 (53,350)   820     580     750



Table 2. Nondeterministic Benchmarks (splits shown for BEAM and AKL) 



The zebra benchmark is a first demonstration of the impact of eager-splitting 
in the EAM. In this example just defining the producer dramatically cuts the 
search space and leads to a search space similar to AKL’s. Moreover, the good 
execution times when compared with Prolog indicate that the system is now 
actually reducing the search space. Producers can also avoid a situation where 
the EAM may loop, as shown in the ancestor benchmark [9]. 

The main benefit of the BEAM is in applications where we can significantly
improve the search space. Such applications may be pure logic programs, or may
be applications that take advantage of the concurrency inherent to the Andorra
Model. We consider two examples. The send-more-money benchmark is a well-known
example of a declarative program that performs badly in Prolog. A set of
Prolog benchmarks would not be complete without experimenting with a naive
solution for the queens problem.





A Novel Implementation of the Extended Andorra Model 211 



Benchs.     BEAM                    And-I    AKL         YAP 4.3

send_money  30 (277)                23       70 (109)    38,520
queens-9    30 (129)                4        40          70
queens-10   120 (364)               10       160         540
queens-11   80 (212)                6        110         4,140
queens-12   520 (1,109)             30       660         41,340
queens-13   250 (522)               16       330         422,790
queens-14   6,270 (9,046)           287      6,910       5,383,760
queens-15   5,390 (7,054)           233      6,510       >20 hours
queens-16   48,550 (52,617)         1,860    51,550      >36 hours
queens-17   32,180 (31,210)         1,109    35,640
queens-18   285,610 (236,172)       8,954    284,250
queens-19   20,840 (16,178)         593      22,550
queens-20   1,891,500 (1,229,355)   50,449   1,816,270





Table 3. Reduced Search Benchmarks 



Results are shown in Table 3. The send-more-money benchmark is quite
interesting because, although the EAM performs more splits than AKL, it has
better performance, in fact similar to Andorra-I. Performance is three orders of
magnitude faster than Prolog’s.

We used queens as a test of the scalability of the three systems. In this case
the BEAM and AKL perform exactly the same number of splits and have very
similar performance. The AKL gains some ground for larger sized benchmarks.
In this case, performance largely depends on the memory management
subsystem [14]: how often we do garbage collection and how efficiently garbage
collection is implemented. We discuss the problem in more detail next. The
benchmark is also interesting in that it shows a situation where the more
Prolog-like Andorra-I actually has the best search space and obtains the best
results.

The previous discussion shows how important memory behaviour is for larger
applications. In order to have a better understanding of this problem, Table 4
presents the results of BEAM running the queens benchmark with and without
the garbage collector and with different memory configurations. The time is
presented in seconds, and the number of invocations of the garbage collector is
presented in brackets. We chose the queens benchmark for a deeper analysis
because (i) it is a well-known benchmark; (ii) we can easily vary problem size
by increasing the number of queens; and (iii) it is an example where the BEAM
can outperform Prolog by limiting the search space.

6 Conclusions and Future Work 

We have presented the design and the implementation of the BEAM, a system 
for the efficient execution of logic programs based on David H. D. Warren’s work 
on the Extended Andorra Model with implicit control. Our work was motivated 
by our interest in studying how the EAM with Implicit Control can be effectively
implemented and how it can perform versus other execution strategies.

MEM config          queens-9 (all)      queens-15 (1st)     queens-16 (1st)
Heap + Boxed        Time     GC         Time     GC         Time     GC
1MB + 1MB           7.98     (272)      7.98     (317)      78.41    (3176)
2MB + 2MB           7.49     (132)      6.54     (136)      60.88    (1284)
4MB + 2MB           7.16     (65)       5.95     (63)       54.15    (585)
8MB + 4MB           7.17     (32)       5.72     (30)       51.14    (280)
16MB + 8MB          7.25     (16)       5.59     (15)       49.86    (137)
32MB + 16MB         7.29     (8)        5.49     (7)        49.20    (67)
64MB + 32MB         7.24     (4)        5.39     (3)        48.55    (33)
128MB + 32MB        7.00     (1)        5.41     (1)        48.65    (16)
No GC (128+32MB)    6.97     NA         5.35     NA         not enough mem



Table 4. BEAM with different memory configurations (time in seconds). 



We believe that our first results are quite promising. The model performs well,
even when just using implicit control. Moreover, simple programmer annotations
and the use of pruning operators can often lead to better results than explicitly
annotating the whole program.

The BEAM opens up a design space of novel mechanisms for improving the
control of (constraint) logic programs. We are exploring this space with a view
to providing an extensive framework for the effective execution of logic
programs. As a result, we have recently proposed the LIGHT-BEAM model, which
addresses two major limitations of the BEAM by including support for tabling
and for scoping of computing.

We are also pursuing work in the parallel execution of Extended Andorra
Model programs. We recently proposed the RAINBOW [15], a parallel framework
for the evaluation of logic programs. The RAINBOW addresses the control
issues and the data structures required for parallel execution in a shared-memory
context, but with a view to distributed shared memory execution.

Abstract. The lion’s share of datalog features have been incorporated
into the SQL3 standard proposal. However, most SQL manuals still recommend
implementing user-defined conditions for data integrity non-declaratively,
by triggers or stored procedures. We describe how to implement known
declarative database technology for integrity checking in
SQL databases. We show how to represent and evaluate arbitrarily complex
constraints in SQL without incurring major disadvantages usually
associated with integrity checking in large databases. Error-prone procedural
specification and laborious maintenance of integrity constraints are
avoided by the declarativity of the specification language. The cost of
evaluation is considerably reduced by an automated translation of declarative
specifications to SQL triggers.



Introduction 

Integrity checking has always been an important issue in database management
systems. Thorough theoretical methodologies have been devised in [Ni]
and many others (cf. [D4]). In practice, however, most database management
systems (DBMS) content themselves with supporting only value constraints,
uniqueness constraints and referential integrity in a declarative manner. For
less simple user-defined constraints, there usually are two choices of
expression: either as SQL conditions in CHECK clauses of CREATE TABLE
statements, or as triggers which fire upon update events. Both options tie
constraints dynamically to updates of specific table columns, which compromises
the declarativity of constraint specification. The first option has the
additional disadvantage of neglecting a possibly large potential for
simplification which would speed up the evaluation of constraints. The second
option has further disadvantages of procedurality, to be addressed in section 1.
In this paper, we describe how to combine the advantages of declarativity of
specification and efficiency of execution, without incurring the disadvantages
of CHECK clauses and procedural trigger specification.

In [D1], we have generalized the approach to simplified integrity checking
in relational databases in [Ni] to the deductive case. The resulting method was
called soundcheck. Similar methods were proposed by many other authors (cf.
[CG-k] [D4]). Orthogonal approaches in terms of conjunctive query optimization
have been discussed in [El] [LS] [GS-k] [RSS] and others. Common to all of them
is the declarativity of integrity constraint specification, as opposed to the
procedurality of specification and evaluation in commercial DBMSs.

Unfortunately, none of the Prolog implementations of soundcheck in various
prototype systems ever went commercial. After all, the language of choice on the
database market has not been Prolog, but SQL. Many theoretical visions and
achievements in logic databases have found their way into SQL database systems.
However, beyond commonplace kinds of constraints, theoretical approaches to
declarative integrity checking have never really made it into practice. Rather
than trying to explain why that is so, this paper sets out to show how the
soundcheck approach can be translated to a practical SQL format.

In section 1, we recapitulate common principles of simplifying integrity check- 
ing which will accompany us through the rest of the paper. In section 2, we define 
and discuss a syntax for integrity constraints which is sufficiently expressive and 
lends itself well toward an easy translation into SQL conditions. In section 3, 
we describe such a translation. In section 4, we use SQL system predicates for 
translating integrity constraints into “update constraints”, which are relevant 
and specialized for specific update patterns. Using translations described in sec- 
tions 3 and 4, we show in section 5 how integrity constraints can be translated 
into equivalent but more efficient triggers. In section 6, we illustrate some fur- 
ther possible optimizations. In the conclusion, we summarize the paper, address 
related work and point out directions for future work. For simplicity, we assume 
that updates are single tuple insertions or deletions. However, an extension to 
more general transactions is easily conceived. 

1 Principles of Integrity Checking in Logic Databases 

In this section, we outline the six phases of the soundcheck approach for
improving the efficiency of integrity checking. All or part of this approach is
effectively used in one way or another (possibly with different sequencing or
interleaving of phases) in most known theoretical approaches to integrity
checking. In sections 2-6, we will show how it can also be applied in practice
to SQL databases. The six phases are captioned below. A subsequent example
illustrates their meaning.

I Generate difference between old and new state 

II Skip idle updates 

III Identify relevant integrity constraints 

IV Specialize relevant constraints 

V Optimize specialized constraints 

VI Evaluate optimized constraints 

For illustrating phases I-VI, consider an update of an SQL database with
relations for workers (wkr) and managers (mgr), defined as follows.

CREATE TABLE (wkr(CHAR[] name, CHAR[] department, DATE start))

CREATE TABLE (mgr(CHAR[] name)).

The attribute start is supposed to contain the date when the worker was em- 
ployed, all other attributes are self-explaining. Now, suppose there is an integrity 
constraint requiring that no worker be a manager. That can be expressed by the 
SQL condition 




216 



Hendrik Decker 



NOT EXISTS (SELECT * FROM wkr, mgr WHERE wkr.name = mgr.name).

If the number of workers and managers is large (e.g., in the database of a large 
company with tens or hundreds of thousands of workers and a huge management 
hierarchy), then checking this constraint can be very costly. The number of facts
to be retrieved from the two relations is of the order of the size of their
Cartesian product. However, we are going to see that the frequency and the
amount of accessing stored facts can be significantly reduced by taking steps
I-VI.

At this point, SQL programmers might feel compelled to point out that the 
constraint above is probably much easier checked by a trigger such as 

CREATE TRIGGER ON wkr FOR INSERT :

IF EXISTS (SELECT * FROM inserted, mgr WHERE inserted.name = mgr.name)
ROLLBACK

which requires evaluation only for each attempt to insert a row into wkr,
where only the stored relation mgr and the singleton cached relation inserted of
rows to be inserted need to be accessed, but not the stored part of wkr. And
indeed, the translations described in sections 3-5 automatically produce the
trigger above, so that the user does not have to program it.

In general, things are less easy than this example might suggest. Integrity 
constraints can be much more complicated, and triggers may bring about un- 
foreseen effects that are hard to control. For instance, it is easily overlooked 
that the integrity constraint above is “symmetric” for wkr and mgr, since it also 
requires implicitly that somebody who is promoted to manager is not a worker, 
thus necessitating a second trigger for insertions into mgr. If only a single trigger 
for wkr or mgr is programmed, then updates of the other relation which violate 
the constraint will go unnoticed. However, the second trigger is also produced
automatically by the translations described in sections 3-5.
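The pair of symmetric triggers discussed above can be tried out with a minimal sketch. It assumes SQLite (through Python's standard sqlite3 module) rather than the generic trigger syntax used in this paper, so the cached relation inserted is replaced by SQLite's NEW row reference and ROLLBACK by RAISE(ABORT, ...). The trigger names and the helpers make_db and try_insert are hypothetical.

```python
import sqlite3


def make_db():
    """Create an in-memory database with wkr, mgr and the two triggers."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE wkr (name TEXT, department TEXT, start TEXT);
        CREATE TABLE mgr (name TEXT);

        -- Checked on each insertion into wkr; reads only mgr and the new row.
        CREATE TRIGGER wkr_not_mgr BEFORE INSERT ON wkr
        BEGIN
            SELECT RAISE(ABORT, 'integrity violated: worker is a manager')
            WHERE EXISTS (SELECT * FROM mgr WHERE mgr.name = NEW.name);
        END;

        -- The symmetric case: checked on each insertion into mgr.
        CREATE TRIGGER mgr_not_wkr BEFORE INSERT ON mgr
        BEGIN
            SELECT RAISE(ABORT, 'integrity violated: manager is a worker')
            WHERE EXISTS (SELECT * FROM wkr WHERE wkr.name = NEW.name);
        END;
        """
    )
    return conn


def try_insert(conn, statement, row):
    """Run one INSERT; return True if accepted, False if a trigger rejected it."""
    try:
        conn.execute(statement, row)
        return True
    except sqlite3.DatabaseError:
        return False


if __name__ == "__main__":
    db = make_db()
    print(try_insert(db, "INSERT INTO mgr VALUES (?)", ("Mary",)))
    print(try_insert(db, "INSERT INTO wkr VALUES (?, ?, ?)",
                     ("Mary", "sales", "2001-01-01")))
```

Dropping the second trigger from this sketch would let an insertion into mgr of somebody who is already a worker pass unnoticed, which is the oversight described in the text.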

Now, back to the six phases. Let INSERT wkr(Fred, sales, 2001-01-01) be an
update. Then, going from I through VI means the following:

I) In case there are database views the definition of which involves wkr, the
explicit update INSERT wkr(Fred, sales, 2001-01-01) may have implicit update
consequences on such views. Thus, for each such consequence, all steps in phases
II-VI need to be considered. For example, suppose there is a view pension,
containing all workers entitled to obtain a pension (e.g., if the total time they
have worked for the company sums up to a limit number of years), and a constraint
on that view (e.g., expressing an exceptional condition under which pension is
not granted). Then, that constraint needs to be evaluated if and only if inserting
wkr(Fred) implies pension(Fred).

In general, phase I is needed only in deductive databases or relational data- 
bases with view definitions. Otherwise, it can be ignored. In this paper, we do 
not consider relations defined by views. 

II) If Fred has already been a worker (e.g., in some other department) before
the INSERT statement was launched, then it clearly is not necessary to check
again the constraint that he must not be a manager, because it has already been
known to the database that Fred is not a manager (since he has been a worker and
because the constraint has been satisfied in the previous database state).

In practice, however, checking the idleness of an update request may sometimes
be hardly less costly than just checking a related constraint. Thus, for
simplicity, we do not work out in this paper how phase II could be translated
into SQL.

III) Unless II applies, the constraint that no worker must be a manager is
clearly relevant and must be checked. However, any integrity constraint which
does not involve wkr need not be checked. More precisely, each constraint which
is not relevant for the insertion of rows into wkr need not be checked. For
instance, a constraint which requires that in each department, there must be at
least 10 workers, is not relevant for insertions but only for deletions in the wkr
table. We describe how to identify relevant constraints in general in 4.1.

IV) For the given INSERT statement, the WHERE clause of

EXISTS (SELECT * FROM wkr, mgr WHERE wkr.name = mgr.name)

can be specialized to a much less expensive form:

EXISTS (SELECT * FROM wkr, mgr
WHERE wkr.name = Fred AND wkr.name = mgr.name).

Specializing constraints in general is discussed in 4.2.

V) Clearly, the specialized condition in IV can be optimized to the statement

EXISTS (SELECT * FROM mgr WHERE name = Fred).

VI) After having gone through I to V, evaluation of the resulting query 
whether Fred is a manager is easy. Looking for a single fact in a stored relation is 
of course much less costly than having to evaluate the original integrity constraint 
in its full generality (not to mention other constraints that might be unnecessarily 
checked if phase III has been ignored). 
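The effect of phases IV to VI can be checked on a small instance. The following sketch, again assuming SQLite through Python's sqlite3 module, tentatively inserts a worker and evaluates the full denial condition, its specialized form and its optimized form side by side. The sample rows and the helper names make_db and violation_checks are made up for illustration.

```python
import sqlite3

# Denial conditions: each returns 1 exactly when integrity is violated.
FULL = "SELECT EXISTS (SELECT * FROM wkr, mgr WHERE wkr.name = mgr.name)"
SPECIALIZED = ("SELECT EXISTS (SELECT * FROM wkr, mgr "
               "WHERE wkr.name = ? AND wkr.name = mgr.name)")
OPTIMIZED = "SELECT EXISTS (SELECT * FROM mgr WHERE name = ?)"


def make_db():
    """Build a state that satisfies the constraint (no worker is a manager)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE wkr (name TEXT, department TEXT, start TEXT)")
    conn.execute("CREATE TABLE mgr (name TEXT)")
    conn.executemany("INSERT INTO wkr VALUES (?, ?, ?)",
                     [("Ann", "sales", "1999-05-01"), ("Tom", "it", "2000-02-15")])
    conn.executemany("INSERT INTO mgr VALUES (?)", [("Mary",), ("Bob",)])
    conn.commit()
    return conn


def violation_checks(conn, name):
    """Tentatively insert wkr(name, ...) and evaluate the three conditions."""
    conn.execute("INSERT INTO wkr VALUES (?, 'sales', '2001-01-01')", (name,))
    full = bool(conn.execute(FULL).fetchone()[0])
    specialized = bool(conn.execute(SPECIALIZED, (name,)).fetchone()[0])
    optimized = bool(conn.execute(OPTIMIZED, (name,)).fetchone()[0])
    conn.rollback()  # undo the tentative insertion
    return full, specialized, optimized


if __name__ == "__main__":
    db = make_db()
    print(violation_checks(db, "Fred"))
    print(violation_checks(db, "Mary"))
```

The three conditions agree here only because the state before the update satisfies the constraint; that precondition is what licenses the specialization. The optimized condition never reads wkr at all.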

The example above is a very simple one. In general, the treatment of con- 
straints with nested quantification and negation is quite intricate; even referential 
integrity is more involved (cf. section 6, example 4). However, we are going to 
see that the same proportions of simplification and reduction of necessary work 
can be obtained systematically for arbitrarily complex integrity constraints. 



2 Representing First-Order Sentences in Range Form 

Integrity constraints are expressed by first-order predicate calculus sentences
which obey the well-known range-restricted property. A slight extension of its
definition in [D1] is used in 2.1, along the lines of [VT], to also cover built-in
predicates. For enabling an easily automated translation of first-order sentences
into logically equivalent SQL conditions, we define in 2.1 a syntactic pattern
called range form. For a variant of the range form in 2.1, similar translations
have been specified in [D1] and [VT]. The range form in [D1] had been developed
for supporting an efficient evaluation of arbitrary integrity constraints by Prolog
interpreters. The range form in 2.1 lends itself particularly well toward evaluation
with an SQL engine. The syntax defined in 2.1 is discussed in 2.2.

2.1 The Syntax of the Range Form 

The connectors to be used are conjunction $\land$, disjunction $\lor$ and
negation $\sim$; the only quantifier is $\exists$. In the BNF rules below, we
distinguish built-in system predicates (such as $=$, $\neq$, $<$, $\leq$, etc.)
from user-defined predicates (which correspond to SQL relations declared by
CREATE TABLE clauses). In general, system predicates are not updatable, while
user-defined ones are. For compatibility with SQL, we implicitly assume that
each term occurring as an argument of a predicate is appropriately typed. We
denote identity of formulas by =.

RF ::= RF $\land$ RF | RF $\lor$ RF
    where extra brackets may be used to establish or override connector
    precedences.

RF ::= PRF | NRF
    where PRF and NRF stand for positive and, resp., negative range form.
    An expansion of RF by PRF or NRF is called top-level range form.

NRF ::= $\sim$ PRF

PRF ::= $\exists X\,(Range(X))$
    where, for $X$ and $Range(X)$, the same as in the following rule for PRF
    applies.

PRF ::= $\exists X\,(Range(X) \land SF)$
    where $X$ is a vector of $m$ distinct variables ($m \geq 1$), $Range(X)$ is
    a conjunction of $n$ positive literals ($n \geq 1$), each with a
    user-defined predicate of arity $\geq 1$, the sum of the $n$ arities is $m$,
    each of their arguments is a variable, and each variable in $X$ occurs in
    $Range(X)$.

SF ::= SF $\land$ SF | SF $\lor$ SF
    where extra brackets may be used to establish or override precedences of
    connectors.

SF ::= PRF | NRF | LS
    where LS is a literal with a system predicate, and each variable in LS must
    be covered by some range expression, in virtue of the definition of PRF.

The term $Range(X)$ in the definition of PRF is called range expression. For a
variable $x$ in $X$, each occurrence of $x$ in the subformula SF is said to be
covered by $Range(X)$. For convenience, we may represent a variable in $X$ which
does not occur in SF as an anonymous variable.

2.2 Discussion of the Range Form

It can be shown that the syntax in 2.1 shares all essential advantages of the
range form in [D1] [VT]. In fact, its evaluation is even more efficient than the
latter, since 2.1 permits minimization of the scope of quantifiers, thus avoiding
multiple occurrences of literals due to the distributivity of $\land$ and $\lor$.

Note that no negative literals with user-defined predicate occur in the range
form syntax of 2.1. However, there is no loss of generality, since a negative literal
of form $\sim p(t_1, \ldots, t_k)$ can be expressed equivalently by the negative
range form

$\sim \exists x_1, \ldots, x_k\,(p(x_1, \ldots, x_k) \land x_1 = t_1 \land \ldots \land x_k = t_k)$

where $x_1, \ldots, x_k$ are fresh variable symbols. In general, the syntax of 2.1
prescribes the use of equalities for expressing unification of variables among each
other or with constant terms. For example, the range form of the formula
$\forall x\,(p(x, x) \rightarrow q(x, c) \land \exists y\,(r(x, y)))$ is

$\sim \exists x_1, x_2\,(p(x_1, x_2) \land x_1 = x_2 \land (\sim \exists y_1, y_2\,(q(y_1, y_2) \land y_1 = x_1 \land y_2 = c) \lor \sim \exists z\,(r(z, \_) \land z = x_1)))$

Also note that, in each formula which complies with the syntax in 2.1, at least
one literal with user-defined predicate must occur in its range expression, and it
must not contain any user-defined 0-place predicate. (Analogously, commercial
DBMSs only admit constraints on user-defined tables, not on system tables, and
do not permit the creation of tables without attributes.) As opposed to [D1], the
syntax in 2.1 permits the use of system predicates.

It is easy to verify that 2.1 effectively requires each formula RF in range
form to be closed and allowed [VT], i.e., each variable in RF occurs in the scope
of an $\exists$ quantifier and is covered by the range expression associated with
that quantifier. Similar to related proofs in [D1] [VT], it can be shown that, except
useless formulas without any user-defined predicate or with a user-defined 0-place
predicate, each closed formula which is range-restricted [D1] can be equivalently
represented in the range form syntax of 2.1. Thus, the latter practically incurs
no loss of generality. Along the lines of [D1] [D2] [VT], it can even be argued that,
without a sophisticated syntactic device such as the range form, the evaluation
of many integrity constraints would probably be much more complicated or even
impossible (which might be one of the reasons why declarative integrity checking
methods have not yet found their way into commercial DBMSs).



2.3 Example: $\exists x\,\forall y\,(\sim empl(y, sales) \lor sup(x, y))$

This integrity constraint (let’s name it IC) expresses that there must be an
individual x who is superior of all employees in the sales department. IC is not
“safe” in the usual sense, but range-restricted, hence domain-independent [Ni],
hence always returns a boolean truth value, and thus can always be evaluated
safely. A representation of IC in the range form of [D1] is

$\exists x\,(sup(x, \_) \land \forall y\,(empl(y, sales) \rightarrow sup(x, y))) \lor \sim \exists y\,(empl(y, sales))$.

A representation of IC in the range form according to 2.1 is

$\exists x\,(sup(x, \_) \land \sim \exists y_1, y_2\,(empl(y_1, y_2) \land y_2 = sales \land \sim \exists x_1, x_2\,(sup(x_1, x_2) \land x_1 = x \land x_2 = y_1))) \lor \sim \exists y\,(empl(\_, y) \land y = sales)$.



3 Translating Formulas in Range Form to SQL 

Before specifying the translation of formulas in range form into equivalent SQL
conditions, in general, we are going to show the result of the translation of IC in
2.3, as an illustration. We assume that relations empl and sup have been defined
by appropriate CREATE TABLE clauses. For convenience, let the argument in the
i-th column of a relation rel be denoted by rel.i, from now on.

EXISTS (SELECT * FROM sup s1 WHERE NOT EXISTS
(SELECT * FROM empl WHERE empl.2=sales AND NOT EXISTS
(SELECT * FROM sup s2 WHERE s2.1=s1.1 AND s2.2=empl.1)))

OR NOT EXISTS (SELECT * FROM empl WHERE empl.2=sales)

In general, multiple occurrences of relation names in SQL clauses need to
be kept apart by postfixed alias names, as usual in SQL. In the example, two
occurrences of sup are distinguished by aliases s1 and s2. For convenience, we
may loosely speak later on of a “predicate” p (say) in an SQL clause when we
really mean to identify the alias name of a particular occurrence of the relation
name corresponding to p.
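To see the translated condition at work, the sketch below evaluates it on small instances. It assumes SQLite via Python's sqlite3 module. Since SQL columns cannot be addressed by position as in rel.i, the columns are given the illustrative names boss, sub, name and dept, and the constant sales is written as a string literal. The function name ic_holds is hypothetical.

```python
import sqlite3

# The condition for IC, with named columns in place of rel.1 and rel.2.
IC_SQL = """
SELECT
  EXISTS (SELECT * FROM sup s1 WHERE NOT EXISTS
    (SELECT * FROM empl WHERE empl.dept = 'sales' AND NOT EXISTS
      (SELECT * FROM sup s2 WHERE s2.boss = s1.boss AND s2.sub = empl.name)))
  OR NOT EXISTS (SELECT * FROM empl WHERE empl.dept = 'sales')
"""


def ic_holds(sup_rows, empl_rows):
    """Load the given rows into a fresh database and evaluate IC."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sup (boss TEXT, sub TEXT)")
    conn.execute("CREATE TABLE empl (name TEXT, dept TEXT)")
    conn.executemany("INSERT INTO sup VALUES (?, ?)", sup_rows)
    conn.executemany("INSERT INTO empl VALUES (?, ?)", empl_rows)
    result = conn.execute(IC_SQL).fetchone()[0]
    conn.close()
    return bool(result)


if __name__ == "__main__":
    staff = [("ann", "sales"), ("bob", "sales"), ("cy", "it")]
    # dora is superior of every sales employee
    print(ic_holds([("dora", "ann"), ("dora", "bob")], staff))
    # no single individual is superior of both ann and bob
    print(ic_holds([("dora", "ann"), ("eve", "bob")], staff))
```

The second disjunct matters when there are no sales employees: the condition then holds even if sup is empty.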

The equations below for translating a formula RF in range form into an
equivalent SQL condition $sql(RF)$ recur on the BNF syntax with its symbols
PRF, SF, LS in 2.1. For convenience, multiple occurrences of symbols RF and
SF in some BNF rules of 2.1 are indexed.

$sql(RF_1 \land RF_2)$ = $sql(RF_1)$ AND $sql(RF_2)$

$sql(RF_1 \lor RF_2)$ = $sql(RF_1)$ OR $sql(RF_2)$

$sql(\sim PRF)$ = NOT $sql(PRF)$

$sql(\exists X\,(Range(X)))$ = EXISTS (SELECT * FROM $p_1, \ldots, p_n$)
where, for $X$ and $Range(X)$, the same as in the following case applies.

$sql(\exists X\,(Range(X) \land SF))$ =

EXISTS (SELECT * FROM $p_1, \ldots, p_n$ WHERE $sql(SF)$)

where $p_1, \ldots, p_n$ are the predicates of the $n$ literals in $Range(X)$.
Multiple occurrences of $p_i$ ($1 \leq i \leq n$) in range expressions of PRF
(including nested ones in SF) need to be consistently postfixed with
distinguished alias names in $sql(RF)$ (which, for simplicity, is not denoted
explicitly here).

$sql(SF_1 \land SF_2)$ = $sql(SF_1)$ AND $sql(SF_2)$

$sql(SF_1 \lor SF_2)$ = $sql(SF_1)$ OR $sql(SF_2)$

$sql(LS)$ is defined as follows: By definition, each variable $x$ in LS is covered
and thus occurs in the $i(x)$-th position (say) of a literal with some user-defined
predicate $p(x)$ in the range expression which covers $x$. Thus, $sql(LS)$ is
obtained by replacing each occurrence of $x$ in LS with $p(x).i(x)$ (which is
well-defined since, for each variable in a formula in range form, its cover is
unique).
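The recursive equations above can be sketched as a small translator. This is an illustration under stated assumptions, not the implementation behind this paper: formulas are nested Python tuples (a hypothetical encoding), alias names are supplied by hand in each range literal, the i-th column of every table is assumed to be named c1, c2, ..., constants are tagged as ("const", value), and "_" stands for an anonymous variable. The result is evaluated with SQLite.

```python
import sqlite3


def term(t, cover):
    """Render a constant as an SQL literal, or a variable via its cover."""
    if isinstance(t, tuple) and t[0] == "const":
        return "'" + str(t[1]).replace("'", "''") + "'"
    return cover[t]  # a KeyError means the variable is not covered


def sql(f, cover=None):
    """Translate a formula in range form into an SQL condition."""
    cover = dict(cover or {})  # copy, so that sibling scopes stay separate
    tag = f[0]
    if tag in ("and", "or"):
        return "(" + sql(f[1], cover) + " " + tag.upper() + " " + sql(f[2], cover) + ")"
    if tag == "not":
        return "NOT " + sql(f[1], cover)
    if tag == "exists":
        _, literals, sf = f
        tables = []
        for pred, alias, args in literals:
            tables.append(pred + " " + alias)
            for i, var in enumerate(args, start=1):
                if var != "_":
                    cover[var] = alias + ".c" + str(i)  # p(x).i(x)
        where = "" if sf is None else " WHERE " + sql(sf, cover)
        return "EXISTS (SELECT * FROM " + ", ".join(tables) + where + ")"
    if tag == "lit":
        _, op, left, right = f
        return term(left, cover) + " " + op + " " + term(right, cover)
    raise ValueError("unknown node: %r" % (tag,))


# IC of 2.3 in the range form of 2.1.
IC = ("or",
      ("exists", [("sup", "s1", ["x", "_"])],
       ("not", ("exists", [("empl", "e1", ["y1", "y2"])],
                ("and", ("lit", "=", "y2", ("const", "sales")),
                 ("not", ("exists", [("sup", "s2", ["x1", "x2"])],
                          ("and", ("lit", "=", "x1", "x"),
                           ("lit", "=", "x2", "y1")))))))),
      ("not", ("exists", [("empl", "e2", ["_", "y"])],
               ("lit", "=", "y", ("const", "sales")))))


def evaluate(formula, sup_rows, empl_rows):
    """Evaluate sql(formula) against a fresh SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sup (c1 TEXT, c2 TEXT)")
    conn.execute("CREATE TABLE empl (c1 TEXT, c2 TEXT)")
    conn.executemany("INSERT INTO sup VALUES (?, ?)", sup_rows)
    conn.executemany("INSERT INTO empl VALUES (?, ?)", empl_rows)
    result = conn.execute("SELECT " + sql(formula)).fetchone()[0]
    conn.close()
    return bool(result)


if __name__ == "__main__":
    print(sql(IC))
```

The cover dictionary plays the role of the mapping from each variable x to p(x).i(x). It is copied on every call so that a binding introduced by one quantifier is visible only inside that quantifier's scope.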



4 Identifying and Specializing Relevant Constraints 

It is straightforward to verify that the evaluation of an integrity constraint IC
precisely corresponds to evaluating the SQL query sql(IC). However, any potential
for improving the efficiency of evaluation according to phases III and IV
(section 1) still remains to be exploited. Precisely that is the purpose of this
section.

In 4.1, we outline a formalism of how to identify relevant constraints according
to phase III. In 4.2, we describe how to specialize the formulas obtained in
4.1 according to phase IV. In order to have sufficient syntactic flexibility, we do
not require a representation of integrity constraints in the range form syntax, at
this stage (although we do not rule it out either).



4.1 Identifying Relevant Constraints 

In this subsection, we elaborate on phase III (section 1). We are going to define a 
formalism which is targeted toward an easy translation into SQL triggers, by us- 
ing system predicates that are usual for programming triggers in SQL databases. 

According to [Ni], an integrity constraint IC is potentially relevant for the
insertion (resp., deletion) of a fact F, and thus needs to be checked, only if
there is an atom with negative (resp., positive) polarity in IC which unifies
with F. (An atom in a formula W is of negative, resp., positive polarity if, in
the negation-innermost form of W, the atom is negated or, resp., not negated.
The negation-innermost form is obtained by well-known equivalence-preserving
transformations which right-shift each negation symbol as far as possible and
eliminate double negation.) We recall that, according to [Ni], as many checks as
there are matching occurrences in IC with the respective polarity are necessary,
in general.

For example, the constraint $\forall x\,(\sim p(x, b) \lor \sim p(a, x))$ clearly
is potentially relevant for insertions of facts about p. More precisely, it is
potentially violated by insertions of ground instances of $p(x, b)$ and $p(a, x)$.
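The relevance rule can be sketched in a few lines. The encoding below is hypothetical: formulas are nested tuples built only from atoms, negation, conjunction, disjunction and quantifiers, and variables are strings starting with "?". Under that restriction, the polarity of an atom in the negation-innermost form is determined by the parity of the negations above it, so no explicit transformation is needed.

```python
def occurrences(f, positive=True):
    """List (predicate, arguments, polarity) for each atom occurrence in f."""
    tag = f[0]
    if tag == "atom":
        return [(f[1], f[2], "positive" if positive else "negative")]
    if tag == "not":
        return occurrences(f[1], not positive)  # negation flips polarity
    if tag in ("and", "or"):
        return occurrences(f[1], positive) + occurrences(f[2], positive)
    if tag in ("forall", "exists"):
        return occurrences(f[2], positive)
    raise ValueError("unknown node: %r" % (tag,))


def match(pattern, fact):
    """Unify an atom's arguments with a ground fact; None if they clash."""
    if len(pattern) != len(fact):
        return None
    binding = {}
    for p, c in zip(pattern, fact):
        if p.startswith("?"):
            if binding.setdefault(p, c) != c:
                return None  # same variable bound to two constants
        elif p != c:
            return None  # constants differ
    return binding


def relevant(ic, kind, pred, fact):
    """Argument patterns of the occurrences that make ic relevant."""
    wanted = "negative" if kind == "insert" else "positive"
    return [args for name, args, polarity in occurrences(ic)
            if name == pred and polarity == wanted
            and match(args, fact) is not None]


# forall x (~p(x, b) or ~p(a, x))
IC = ("forall", "?x",
      ("or", ("not", ("atom", "p", ("?x", "b"))),
       ("not", ("atom", "p", ("a", "?x")))))

if __name__ == "__main__":
    print(relevant(IC, "insert", "p", ("c", "b")))
    print(relevant(IC, "insert", "p", ("a", "b")))
    print(relevant(IC, "delete", "p", ("a", "b")))
```

An insertion of p(a, b) matches both occurrences, so two checks are needed in general, whereas deletions from p never make this constraint relevant.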

Similar to what is called “update constraints” in soundcheck [D1], we are
going to incorporate this rule of relevance into the constraints to be checked
upon a given update. Similar to what is usual in SQL databases, we assume
the existence of two distinguished predicates inserted and deleted, which are
cached and not write-accessible to the database user. For a given update which
requests the insertion or the deletion of some fact $p(c_1, \ldots, c_k)$, where
$c_1, \ldots, c_k$ are constants, querying inserted or, resp., deleted returns the
answer $c_1, \ldots, c_k$. (As usual in SQL databases, the arity of inserted and
deleted adapts to the arity of the updated relation.) With that, it is possible to
incorporate the identification of an integrity constraint which is relevant for a
given update, into a formula to be evaluated at update time, as follows.

Let IC be an integrity constraint, p an updatable predicate with arity $k \geq 1$
and $p(t_1, \ldots, t_k)$ an occurrence of an atom in IC, where each term $t_i$
($1 \leq i \leq k$) is either a constant or a variable. For convenience, let
updated stand for inserted if the polarity of $p(t_1, \ldots, t_k)$ is negative,
and for deleted if it is positive. Further, for some $h$, $0 \leq h \leq k$, let
$x_1, \ldots, x_h$ be the variables among the $t_i$. Then, one of the two formulas
(*) below identifies the relevance of IC with regard to requests for inserting or,
resp., deleting facts which unify with $p(t_1, \ldots, t_k)$. If $h = 0$ (i.e.,
if there are no variables in $p(t_1, \ldots, t_k)$), then

(*) $updated(t_1, \ldots, t_k) \land \sim IC$, else
(*) $\exists x_1, \ldots, x_h\,(updated(t_1, \ldots, t_k)) \land \sim IC$.

Note that the negation of IC is used in (*). That corresponds to the good
practice in deductive databases of representing integrity constraints in denial
form, as introduced in [SK]. Denial form is convenient for declaring what should
not be the case, i.e., for stating conditions that should not hold. If, in a database,
such a condition becomes true, it means that integrity is violated. Logically,
an integrity constraint IC is equivalent to its denial representation
$\leftarrow\ \sim IC$. Thus, the negation $\sim IC$ in (*) is the condition which,
when satisfied, signals integrity violation. So, if (*) returns true upon updating
a ground instance of $p(t_1, \ldots, t_k)$, then integrity is violated. Conversely,
an evaluation of (*) in case IC is not relevant for a given update would
immediately return false because of the cached conjunct on the left-hand side of
(*), without evaluating IC.

For the example above, the following two instances of the second of the
formulas (*) are obtained.

$\exists x\,(updated(x, b)) \land \sim \forall x\,(\sim p(x, b) \lor \sim p(a, x))$

$\exists x\,(updated(a, x)) \land \sim \forall x\,(\sim p(x, b) \lor \sim p(a, x))$.

4.2 Specializing Relevant Constraints 

In this subsection, we show how formulas of form (*) above can be specialized,
in the sense of phase IV (section 1).

Again, let IC be an integrity constraint, p an updatable predicate and F a
fact about p to be inserted or deleted, such that IC is relevant for this update
request. That is, F matches with some occurrence of an atom $p(t_1, \ldots, t_k)$
in IC with negative or, resp., positive polarity. Let updated stand for inserted
if the polarity is negative, and for deleted if it is positive. Further, let
$\phi$ be an mgu of F and $p(t_1, \ldots, t_k)$. According to [Ni], IC can then be
specialized to $IC\theta$, where the substitution $\theta$ is obtained from $\phi$
by restricting the latter to those variables in $p(t_1, \ldots, t_k)$ that are
$\forall$-quantified in the negation-innermost form of IC without being dominated
by an $\exists$ (i.e., no $\exists$ quantifier occurs on the left of $\forall x$).
Thus, $\theta$ grounds each such variable $x$ to the corresponding constant value
in F.

Now, we are going to translate this principle of specialization to the
conditions of form (*) (4.1). For convenience, let us designate a variable in IC as
$\hat{\forall}$-quantified when it is $\forall$-quantified and not dominated by an
$\exists$ in its negation-innermost form. Again, for some $h$, $0 \leq h \leq k$,
let $x_1, \ldots, x_h$ be the variables among the $t_i$. W.l.o.g., let, for some
$g$, $0 \leq g \leq h$, $x_1, \ldots, x_g$ be the $\hat{\forall}$-quantified
variables, and $x_{g+1}, \ldots, x_h$ the remaining variables in
$p(t_1, \ldots, t_k)$, if any. Further, let $IC'$ denote the formula obtained by
dropping the quantifiers of each $x_j$ ($1 \leq j \leq g$) in IC. (In general, a
variable $x$ which is $\hat{\forall}$-quantified in IC may be quantified by either
one of $\forall$ and $\exists$ in IC, since the latter is not required to be
in negation-innermost form.) Then, instead of costly conditions of form (*), it
suffices to evaluate one of the following specialized conditions. If $h = 0$, then

(**) $updated(t_1, \ldots, t_k) \land \sim IC$, else

(**) $\exists x_1, \ldots, x_g\,(\exists x_{g+1}, \ldots, x_h\,(updated(t_1, \ldots, t_k)) \land \sim IC')$.

Clearly, the first case is the same as (*), since there are no variables which
could be specialized. But if $g > 0$, i.e., if there are $\hat{\forall}$-quantified
variables, then an instantiation of the variables in updated with ground values of
a fact to be inserted or, resp., deleted, also grounds each
$\hat{\forall}$-quantified variable $x_1, \ldots, x_g$ in $IC'$.

For convenience, let us call formulas of form (**) “update constraints”. More
precisely, let IC be an integrity constraint, p a user-defined predicate and A an
atom in IC with predicate p. Then, for each occurrence $p(t_1, \ldots, t_k)$ (say)
of A in IC, precisely one of the formulas (**) is obtained as described above, and
that formula is called the update constraint of IC for $p(t_1, \ldots, t_k)$.

For the example in 4.1, the following two update constraints are obtained.

$\exists x\,(updated(x, b) \land \sim(\sim p(x, b) \lor \sim p(a, x)))$

$\exists x\,(updated(a, x) \land \sim(\sim p(x, b) \lor \sim p(a, x)))$.

Clearly, some easy optimizations of these update constraint formulas are 
possible. But we leave that to section 6, where phase V of section 1 is discussed. 

5 Translating Integrity Constraints to SQL Triggers 

In this section, we describe how the specialized relevant constraint formulas of 
form (**) in 4.2 are translated into SQL triggers. 

In general, we assume that SQL triggers are of the following form (which 
essentially is a common denominator of the usual appearance of triggers in com- 
mercial SQL database systems). 

CREATE TRIGGER ON [relation] FOR {INSERT | DELETE}: 

IF ([condition]) [action] 

where [relation] names the table which is updated by an insertion or deletion, 
resp.; [condition] is an SQL condition which, when its evaluation in the updated 
state returns true, signifies violation of integrity; [action] is a statement which, in 
practice, usually is a ROLLBACK command for re-installing the database state 
before the update attempt, and the output of a reject message. In principle, it 
may also involve an explanation for the update failure, or even a repair action. 
However, we are not concerned in this paper about what a DBMS does when an 
update would lead to integrity violation. 

According to 4.2, it suffices to evaluate update constraints of form (**), for 
integrity checking upon the insertion or deletion of some fact which matches 
the occurrence of some atom in some integrity constraint. Thus, it suffices to 
have a suitable SQL representation of update constraints as conditions of SQL 
triggers, which are fired upon such updates. In section 3, we have described 
how to translate arbitrary (but range-restricted) conditions, represented as first- 
order predicate calculus sentences in range form, into equivalent SQL conditions. 
However, since we have not required range form syntax in section 4, the update 
constraints of form (**) must first be transformed into range form. 

For limiting the size of this paper, we are not going to specify at length how 
an arbitrary first-order sentence can be transformed into range form. However, 
it has been shown in [D1] [D2] [VT] that each range-restricted sentence can be 
transformed into logically equivalent formulas that obey the range form syntax 
in [D1]. Similar transformations can be worked out also for the range form syntax 
in 2.1. 

So, let us suppose that rf is a mapping which transforms a first-order predicate 
calculus sentence into a representation in the range form syntax of 2.1. 
Further, for an integrity constraint IC and the occurrence A of an atom with 
user-defined predicate p (say) in IC, let up(IC, A) denote the update constraint 
(**) obtained as described in section 4. Then, according to what we have seen 
in 4.2, triggers of the following form (one for each occurrence A of an updatable 
atom in IC) are sufficient for a sound integrity check of IC. 



(***) CREATE TRIGGER ON p FOR {INSERT | DELETE}: 
      IF sql(rf(up(IC, A))) ROLLBACK 



In terms of integrity checking, such triggers are equivalent to the constraint, 
but more efficient than the evaluation of the constraint in its original form. 
Equivalence here is meant in the following sense. If an update does not violate 
a given constraint IC, then the execution of the update is not prohibited by any 
of the triggers for IC. Otherwise, at least one of the triggers is fired upon an 
attempt to execute the update, detects violation and rolls back the update. A 
formal proof of equivalence would essentially rely on the equivalence preservation 
of translations rf and sql. 

From what we have seen already in section 1, it should be obvious that the 
firing of these triggers, which is controlled by update attempts of facts which 
match A, is more efficient than evaluating each constraint for each update, or 
even only each relevant constraint in its original, non-specialized form. (Efficiency 
is measured in terms of the number of facts retrieved from stored relations, and 
the number of times that stored relations are accessed.) 

6 Optimized Triggers 

Triggers of the form (***) (section 5) are already optimized in two ways. Firstly, 
they are made relevant only to particular update patterns, by which many unnec- 
essary checks of their conditions are filtered out. Secondly, they are specialized 
to the values of anticipated update patterns, already at constraint specification 
time. However, a suitable SQL query optimizer may recognize even more possibil- 
ities of optimizing them, such that their evaluation becomes even more efficient, 
in the sense of phase V (sect. 1). 

In general, optimization can take place already at an early stage, e.g., al- 
ready when update constraints of form (**) (4.2) are obtained, i.e., already at 
constraint specification time. However, at that stage, an optimizer for first-order 
predicate calculus sentences, such as the one described in [D1], would have to be 
used, rather than an SQL optimizer. 

We are going to present the result of some obvious optimizations of triggers 
of form (***) by some examples, below. 

Example 1: ∀x(¬p(x, b)) ∨ ∀x(¬p(a, x)) 

This constraint translates into the two optimized triggers 

CREATE TRIGGER ON p FOR INSERT: 
  IF (EXISTS (SELECT * FROM inserted WHERE inserted.2 = b) 
      AND EXISTS (SELECT * FROM p WHERE p.1 = a)) 
  ROLLBACK 

CREATE TRIGGER ON p FOR INSERT: 
  IF (EXISTS (SELECT * FROM inserted WHERE inserted.1 = a) 
      AND EXISTS (SELECT * FROM p WHERE p.2 = b)) 
  ROLLBACK. 

Example 2: ∀x(¬p(x, b) ∨ ¬p(a, x)) 

This constraint has already been mentioned in 4.1 and 4.2. It translates into the 
two optimized triggers 

CREATE TRIGGER ON p FOR INSERT: 
  IF (EXISTS (SELECT * FROM inserted WHERE inserted.2 = b) 
      AND EXISTS (SELECT * FROM p WHERE p.1 = a AND p.2 = inserted.1)) 
  ROLLBACK 

CREATE TRIGGER ON p FOR INSERT: 
  IF (EXISTS (SELECT * FROM inserted WHERE inserted.1 = a) 
      AND EXISTS (SELECT * FROM p WHERE p.1 = inserted.2 AND p.2 = b)) 
  ROLLBACK. 

Example 3: ∀x,y(¬p(x, y) ∨ ¬p(y, x)) 

The translation of this constraint results in two triggers which turn out to coincide, 
yielding the single optimized trigger 

CREATE TRIGGER ON p FOR INSERT: 
  IF (EXISTS (SELECT * FROM inserted WHERE 
        EXISTS (SELECT * FROM p WHERE p.1 = inserted.2 AND p.2 = inserted.1))) 
  ROLLBACK. 
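
For readers who want to experiment, the trigger of Example 3 can be approximated in SQLite through Python's sqlite3 module. This is a sketch under assumptions rather than the translation scheme itself: SQLite has no inserted pseudo-table, so the row-level NEW reference stands in for it; the column names c1 and c2 and the trigger name are invented; and RAISE(ABORT, ...) plays the role of ROLLBACK by undoing the offending statement. The trigger fires AFTER INSERT so that its condition is evaluated in the updated state.

```python
import sqlite3


def open_db():
    """Create an in-memory table p with an antisymmetry-style trigger."""
    conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit
    conn.execute("CREATE TABLE p (c1 TEXT, c2 TEXT)")
    conn.execute("""
        CREATE TRIGGER p_example3 AFTER INSERT ON p
        BEGIN
            SELECT RAISE(ABORT, 'integrity violation')
            WHERE EXISTS (SELECT 1 FROM p
                          WHERE p.c1 = NEW.c2 AND p.c2 = NEW.c1);
        END
    """)
    return conn


def try_insert(conn, x, y):
    """Attempt to insert p(x, y); return False if the trigger rejects it."""
    try:
        conn.execute("INSERT INTO p VALUES (?, ?)", (x, y))
        return True
    except sqlite3.DatabaseError:
        return False
```

Inserting p(a, b) and then p(b, a) should leave only the first fact stored. A fact of the form p(c, c) is rejected as well, because it matches both atoms of the constraint at once.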



Example 4: ∀x,y(¬p(x, y) ∨ ∃z(p(y, z))) 

This constraint expresses a referential relationship between the second and first 
column of p (yet, without requiring that an additional uniqueness constraint 
be imposed on the referenced column, as usual in SQL databases for referential 
integrity). Its translation requires aliases p1 and p2 for the two occurrences of p, 
yielding the optimized triggers 

CREATE TRIGGER ON p FOR INSERT: 
  IF (EXISTS (SELECT * FROM inserted WHERE 
        NOT EXISTS (SELECT * FROM p WHERE p.1 = inserted.2))) 
  ROLLBACK 

CREATE TRIGGER ON p FOR DELETE: 
  IF (EXISTS (SELECT * FROM deleted WHERE 
        EXISTS (SELECT * FROM p p1 WHERE p1.2 = deleted.1 AND 
          NOT EXISTS (SELECT * FROM p p2 WHERE p2.1 = deleted.1)))) 
  ROLLBACK. 



Example 5: ∃x∀y(¬empl(y, sales) ∨ sup(x, y)) 

This constraint, taken from 2.3, translates into two triggers, one for sup and one 
for empl. The optimized trigger for sup is 

CREATE TRIGGER ON sup FOR DELETE: 
  IF (NOT EXISTS (SELECT * FROM sup s1 WHERE NOT EXISTS 
        (SELECT * FROM empl WHERE empl.2 = sales AND NOT EXISTS 
          (SELECT * FROM sup s2 WHERE s2.1 = s1.1 AND s2.2 = empl.1))) 
      AND EXISTS (SELECT * FROM empl WHERE empl.2 = sales)) 
  ROLLBACK. 

The optimized trigger for empl is 

CREATE TRIGGER ON empl FOR INSERT: 
  IF (EXISTS (SELECT * FROM inserted WHERE inserted.2 = sales 
        AND NOT EXISTS (SELECT * FROM sup s1 WHERE NOT EXISTS 
          (SELECT * FROM empl WHERE empl.2 = sales AND NOT EXISTS 
            (SELECT * FROM sup s2 WHERE s2.1 = s1.1 AND s2.2 = empl.1))))) 

  ROLLBACK. 

7 Conclusion 

We have described how to translate the soundcheck methodology for integrity 
checking in logic databases to SQL. Our results provide the grounds for an 
implementation of this methodology in commercial DBMSs. We have specified 
a declarative syntax for expressing arbitrarily quantified integrity constraints, 
which lends itself well toward an efficient evaluation in SQL databases. Also, we 
have translated to SQL the approach in [D1] to simplify integrity checking by 
specializing constraints to update values. To our knowledge, no such translations 
have yet been implemented in SQL databases. Rather, most commercial DBMSs 
do not support declarative specifications of arbitrary constraints, but require 
hand-coding of procedural triggers or stored procedures. What has been done 
in terms of research in the framework of SQL amounts, in various ways, to a 
combination of disparate declarative and procedural mechanisms, e.g., [CPM] 
[MP] [LO]. As opposed to that, our approach is uniformly declarative. It does 
not sacrifice the advantages of efficiency that otherwise may only be obtained by 
compromising declarativity at an early stage. Evaluation of simplified triggers 
according to our approach is much less expensive than evaluating unsimplified 
SQL conditions, in terms of the costs of accessing stored facts. 

Further optimizations of specialized triggers in terms of the results in conjunctive 
query optimization may be investigated. Also, translating the approach 
of specializing integrity constraints on views in [LD] to SQL would be a tempting 
challenge. We intend to translate known techniques (e.g., [D3]) for integrity-preserving 
view updating in logic databases to SQL, along the lines of this paper. 


Abstract. We propose high-level type constructors for constraint programming 
languages, so that constraint satisfaction problems can be 
modelled in very expressive ways. We design a practical set constraint 
language, called ESRA, by incorporating these ideas on top of OPL. A set 
of rewrite rules achieves compilation from ESRA into OPL, yielding pro- 
grams that are often very similar to those that a human OPL modeller 
would (have to) write anyway, so that there is no loss in solving efficiency. 



1 Introduction 

Optimisation problems — where appropriate values for the problem variables 
must be found within their domains, subject to some constraints, such that 
some cost function on these variables takes an optimal value — are ubiquitous 
in industry. Examples are production planning subject to demand and resource 
availability so that profit is maximised, air traffic control subject to safety pro- 
tocols so that flight times are minimised, transportation scheduling subject to 
initial and final location of the goods and the transportation vehicles so that 
delivery time and fuel expenses are minimised, etc. A particular case are deci- 
sion problems, where there is no cost function that must take an optimal value. 
They are collectively known as constraint satisfaction problems (CSPs). Many of 
these problems can be declaratively expressed as constraint programs and then 
be solved using constraint solvers. 

However, effective constraint programming (or: modelling) is very difficult, 
even for application domain experts, and hence time-consuming. Moreover, many 
of these problems are ill-behaved, in the sense that it can be shown that solving 
them requires an amount of time that is worse than polynomial in the size of 
the input data, hence making solving times prohibitively long. 

To address the programming-time problem, ever more expressive and declarative 
constraint programming languages are being designed, providing traditional 
algebraic notations (such as sums and products over indexed expressions) and 
useful datatypes (such as sets, arrays, and enumerations) to enable a more natural 
expression of the constraints, thus freeing the programmer more and more 
from traditional (and often low-level) computing obligations, such as the writing 
of iterative/recursive code or the encoding of concepts as numbers. 

To address the solving-time problem, the default behaviour of the solver can 
be modified and implied constraints can be posted so as to reduce the search 
space. Such optional (but often necessary) practice is however a concession 
that fully declarative constraint programming is still far away, and the ques- 
tion whether procedural search statements can be automatically added upon 
analysis of the constraints remains essentially open (but see [7,10]). 

Concerns about the solving time also require trade-offs about expressiveness: 
the programming language must after all be executable (though need not be 
computationally complete) and its programs should ideally execute quickly (and 
finitely). For instance, set constraint languages may well allow the formulation of 
constraints over sets (such as CLPS [1], CONJUNTO [8], np-SPEC [3], OZ [11], and 
{log} [4]), hence providing enormous expressiveness, but if they cannot be com- 
piled into acceptably fast code, then the advantage of decreased programming 
time is neutralised by the disadvantage of increased solving time. 

Starting from the very expressive (and fast) OPL (Optimisation Program- 
ming Language) [16], we here design an even more expressive (and equally fast) 
language, called ESRA, and show how it is compiled into OPL. Like OPL, the ESRA 
language is strongly typed, and a sugared version of what is essentially a first- 
order logic language. Unlike OPL, the ESRA language supports more advanced 
types, such as mappings, and allows variables of these types as well as of type 
set, making it a set constraint language. 

This paper is organised as follows. In Section 2, we present a motivating 
example. We can then introduce, in Section 3, the syntax of our ESRA language, 
as well as explain, in Section 4, the semantics of ESRA by showing how it is 
compiled into OPL. Finally, in Section 5, we conclude, compare with related 
work, and discuss our directions of future work. 



2 A Motivating Example 

We now argue that it is possible to improve the expressiveness of even OPL. After 
giving a (published) OPL model for a motivating example, we identify expres- 
siveness problems with OPL, propose a more expressive model in our language, 
called ESRA, and show that the OPL model into which it compiles is very similar 
to the one initially given. To make this paper self-contained, no prior knowledge 
of OPL is assumed here and we explain all its features that are used here. 

In the Warehouse Location problem [16], a company considers opening ware- 
houses on some candidate locations in order to supply its existing stores. Each 
possible warehouse has the same maintenance cost, and a capacity designating 
the maximum number of stores that it can supply (C1). Each store must be supplied 
by exactly one open warehouse (C2). The supply cost to a store depends 
on the warehouse. The objective is to determine which warehouses to open, and 
which of these warehouses should supply the various stores, such that the sum 
of the maintenance and supply costs is minimised. 

int MaintCost = ...; 
int NbStores = ...; 
enum Warehouses ...; 
range Stores 0..NbStores-1; 
int Capacity[Warehouses] = ...; 
int SupplyCost[Stores,Warehouses] = ...; 
var int OpenWarehouses[Warehouses] in 0..1; 
var Warehouses Supplier[Stores]; 
minimize 
   sum(I in Stores) SupplyCost[I,Supplier[I]] 
   + sum(J in Warehouses) MaintCost * OpenWarehouses[J] 
subject to { 
   forall(I in Stores) 
      OpenWarehouses[Supplier[I]] = 1; 
   forall(J in Warehouses) 
      (sum(I in Stores) (Supplier[I]=J)) <= Capacity[J]; 
}; 



Fig. 1. Published OPL model of the Warehouse Location problem 



2.1 An OPL Model of the Warehouse Location Problem 

This problem can be modelled in OPL as in Figure 1, which is (a renaming of) 
Statement 12.1 published in [16]. Instance data MaintCost and NbStores are 
integers read in at run-time, and so are the enumeration Warehouses of candidate 
warehouse locations, the (1-dimensional) array Capacity with the integer 
capacities of the warehouses, and the (2-dimensional) array SupplyCost with the 
integer supply costs to the stores from the warehouses. The type Stores is an 
integer range denoting the existing stores. The set of warehouses to be opened is 
modelled as an array OpenWarehouses[Warehouses] of Boolean variables, such 
that OpenWarehouses[W] is 1 if warehouse W is open. Also, the desired mapping 
from Stores into Warehouses is modelled by an array Supplier[Stores] of 
variables ranging in Warehouses, such that Supplier[S] is W when warehouse 
W supplies store S; this representation choice captures the “exactly one” part of 
constraint C2. The minimize statement expresses that the addition of the sum of 
the supply costs and the sum of the maintenance costs (for the actually opened 
warehouses) must be minimal. The first forall statement expresses the “open” 
part of constraint C2, while the second forall statement captures C1. A nested 
constraint, such as (Supplier[I]=J), is seen as 1 if true, and 0 if false. 
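
To make the reading of the model concrete, the following Python sketch enumerates every Supplier array for a tiny instance and keeps the cheapest feasible one. It is an exhaustive search written purely for illustration, not how an OPL solver proceeds; the function name and the list-based data layout are assumptions, and the maintenance cost is assumed to be non-negative.

```python
from itertools import product


def solve_warehouse(maint_cost, capacity, supply_cost):
    """Enumerate every Supplier array and keep the cheapest feasible one.

    capacity[j]       -- maximum number of stores warehouse j can supply
    supply_cost[i][j] -- cost of supplying store i from warehouse j
    """
    warehouses = range(len(capacity))
    stores = range(len(supply_cost))
    best = None
    for supplier in product(warehouses, repeat=len(stores)):
        # First forall: a supplying warehouse must be open. With a
        # non-negative maintenance cost, opening any other warehouse
        # only adds cost, so exactly the supplying ones are opened.
        open_flags = [1 if j in supplier else 0 for j in warehouses]
        # Second forall: (Supplier[I] = J) counts as 1 if true, 0 if false.
        if any(sum(supplier[i] == j for i in stores) > capacity[j]
               for j in warehouses):
            continue
        cost = (sum(supply_cost[i][supplier[i]] for i in stores)
                + sum(maint_cost * open_flags[j] for j in warehouses))
        if best is None or cost < best[0]:
            best = (cost, open_flags, list(supplier))
    return best
```

The result is a triple of total cost, the 0/1 array of opened warehouses, and the supplier chosen for each store, or None when no assignment respects the capacities.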

Let us analyse this OPL model. Since set variables are not available,¹ the 
modeller had to find another way of expressing that a subset of the warehouses 
is to be found. The classical ILP (integer linear programming) way of modelling 
a subset of a given set as an array of Boolean variables was used. Therefore, a 

¹ Like ILOG Solver, OPL only supports ground sets, over any type (not just integers). 
OPL sets are thus only available for instance data, but not for domain variables, and 
OPL set operations thus only serve the pre-processing of instance data. 




232 Pierre Flener, Brahim Hnich, and Zeynep Kiziltan 

 1: int MaintCost = ...; 
 2: int NbStores = ...; 
 3: enum Warehouses ...; 
 4: range Stores 0..NbStores-1; 
 5: int Capacity[Warehouses] = ...; 
 6: int SupplyCost[Stores,Warehouses] = ...; 
 7: var {Warehouses} OpenWarehouses; 
 8: var Stores->OpenWarehouses Supplier; 
 9: minimize 
10:    sum(I->J in Supplier) SupplyCost[I,J] 
11:    + card(OpenWarehouses) * MaintCost 
12: subject to { 
13:    forall(J in OpenWarehouses) 
14:       count(I in Stores: I->J in Supplier) <= Capacity[J]; 
15: }; 



Fig. 2. An ESRA model of the Warehouse Location problem 



mapping from the set of stores into the entire set of warehouses had to be sought, 
instead of a mapping from the set of stores into a subset of the set of warehouses. 
This rather low-level data modelling forced a new way of perceiving the prob- 
lem, leading to a rather awkward modelling of its cost function and constraints. 
Indeed, in the cost function, the sum of the maintenance costs has to be over all 
the warehouses, with the Booleans of OpenWarehouses being reinterpreted as 
weights, instead of over just the open warehouses. Also, the first forall constraint 
is entirely due to the inability of the data modelling to express that a mapping 
from a given set to a subset of another given set has to be sought. The second 
forall constraint is above reproach, however. 



2.2 An ESRA Model of the Warehouse Location Problem 

Our ESRA language allows CSP modelling at an even higher level of abstraction 
than OPL. Introducing (among others) set and mapping variables, constraints 
over sets and mappings, a card function returning the cardinality of a set, and 
a count operator counting the number of times a relation holds, we propose the 
ESRA model in Figure 2 as a more expressive formalisation of the Warehouse 
Location problem. (Line numbers were added for future reference.) The variable 
declarations elegantly express that OpenWarehouses is a subset of Warehouses, 
and that Supplier is a mapping from Stores into OpenWarehouses, and this 
without the modeller having had to worry about their internal representations. 
A more natural formulation of the cost function and constraint C1 arises from 
this, as well as a complete capture of constraint C2 by the data modelling. 

The OPL model generated from that ESRA model is given in Figure 3. (Line 
numbers were added for future reference.) Note the similarity of this OPL model 
with the OPL model in Figure 1. The declaration parts are the same, and so are 
the optimisation parts (except that our generated OPL model exploits distributivity 
of multiplication over addition). In the constraint parts, the first forall 
constraints are identical, while the second forall constraints differ a bit because 
the original OPL model iterates over all warehouses whereas the proposed ESRA 
model iterates only over the open warehouses (in fact, we could have chosen the 
same iteration in the ESRA model, but we believe that it is more natural to be 
only interested in checking the capacity constraints on the open warehouses). 
Furthermore, the generated OPL model has a display part to pretty-print the 
result. The (time and space) execution behaviours are identical. 

a: int MaintCost = ...; 
b: int NbStores = ...; 
c: enum Warehouses ...; 
d: range Stores 0..NbStores-1; 
e: int Capacity[Warehouses] = ...; 
f: int SupplyCost[Stores,Warehouses] = ...; 
g: var int OpenWarehouses[Warehouses] in 0..1; 
h: var Warehouses Supplier[Stores]; 
i: minimize 
j:    sum(I in Stores) SupplyCost[I,Supplier[I]] 
k:    + (sum(J in Warehouses) OpenWarehouses[J]) * MaintCost 
l: subject to { 
m:    forall(I in Stores) 
n:       OpenWarehouses[Supplier[I]] = 1; 
o:    forall(J in Warehouses) 
p:       OpenWarehouses[J] = 1 => 
q:          (sum(I in Stores) (Supplier[I]=J)) <= Capacity[J]; 
r: }; 
s: display(I in Warehouses: OpenWarehouses[I]=1) <I>; 

Fig. 3. Generated OPL model of the Warehouse Location problem 
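
The two differences between Figures 1 and 3 can be checked numerically with a small Python sketch. The function names and list-based data layout are assumptions made for the example. The two capacity checks are expected to agree whenever the first forall constraint holds (every supplying warehouse is open) and capacities are non-negative, since a closed warehouse then supplies no store.

```python
def cost_fig1(maint, open_flags, supply_cost, supplier):
    """Objective of Figure 1: MaintCost is multiplied inside the sum."""
    return (sum(supply_cost[i][j] for i, j in enumerate(supplier))
            + sum(maint * flag for flag in open_flags))


def cost_fig3(maint, open_flags, supply_cost, supplier):
    """Objective of Figure 3: MaintCost is factored out of the sum."""
    return (sum(supply_cost[i][j] for i, j in enumerate(supplier))
            + sum(open_flags) * maint)


def capacity_fig1(open_flags, capacity, supplier):
    """Second forall of Figure 1: checked for every warehouse."""
    return all(supplier.count(j) <= capacity[j]
               for j in range(len(capacity)))


def capacity_fig3(open_flags, capacity, supplier):
    """Second forall of Figure 3: checked only where OpenWarehouses[J] = 1."""
    return all(supplier.count(j) <= capacity[j]
               for j in range(len(capacity)) if open_flags[j] == 1)
```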

3 The Syntax of ESRA 

We now explain the design decisions behind ESRA, introduce its syntax, and 
motivate the need for some useful high-level type constructors. 



3.1 Design Decisions 

Since OPL is arguably the most expressive constraint programming language 
available nowadays, we decided to minimise our efforts for compiling ESRA into 
some executable form by using OPL as target language. Also, since it is arguably 
not frequently possible to improve on the expressiveness of OPL, a natural choice 
was to make ESRA a conservative extension of OPL, so that entire passages of 
ESRA programs can be literally copied during compilation. Just like the designers 
of OPL, we do not really care whether our ESRA language is complete (in any 







sense) or not, our main driving force being rather the design of a language that 
is practical for modelling at least some classes of (real-life) CSPs. 

ESRA is an extension of OPL because we introduce useful high-level type 
constructors and allow the set operators of OPL in set constraints.² ESRA is thus 
designed to be more expressive than even OPL, and we will show that this can 
be done without compromising on efficiency. 

These design decisions allow us to benefit, as a side-effect, from the fact 
that the OPL syntax elegantly hides that OPL actually is a logic language. In- 
deed, typed quantifications are replaced by C-like type and variable declarations, 
conjunction is denoted by a semi-colon (the usual notation in imperative pro- 
gramming for sequential composition), etc. It is unfortunate that plain logic 
notation is considered repulsive by many programmers, so efforts indeed must 
be undertaken to give them a language with the look and feel of other languages. 

3.2 Syntax 

Ignoring search issues, an ESRA program consists of a declaration part, followed 
by an optional optimisation part, and a constraint part, as described next. 

In the declaration part, the syntax of OPL is applied to declare user-defined 
types, as well as typed instance data and variables. Instance data can be ini- 
tialised in the usual OPL ways, at compile-time or at run-time. The primitive 
types are the integers (int), enumerations (enum), and strings (string) of OPL. 
The type constructors are the ranges (range), records (struct), arrays (array), 
and sets ({}) (over any type) of OPL, as well as new ones (described in the next 
sub-section), namely a binary constructor (written -> and used in an infix way) 
for mappings between sets, a unary constructor (perm) for permutations of sets, 
and a binary constructor (seq) for sequences of bounded length over sets. Con- 
trary to OPL, there can even be declarations of variables of type set in ESRA. For 
lack of space, we refer to [6] for the BNF grammar of the declaration part. See 
lines 1 to 8 of Figure 2 for a sample declaration part. 

In the optimisation part and constraint part, the syntax of OPL is again used, 
this time to express the cost function that has to be optimised, and to post 
constraints. The main primitive constraints, relations, and expressions are the 
usual ones for (integer) arithmetic, (Boolean) logic, and sets. Powerful aggrega- 
tion operators such as summation (sum) and universal quantification (forall) 
are available, making more general iteration/recursion mechanisms largely un- 
necessary. Contrary to OPL, the set operators (in, subset, union, inter, card, 
etc) can also be used in ESRA constraints. Existential quantification (exists) 
and counting (count) are also new. We again refer to [6] for the BNF grammar 
of the optimisation and constraint parts. See lines 9 to 11, and lines 12 to 15, 
in Figure 2 for sample optimisation and constraint parts. 



² As our effort is not sponsored by ILOG, we do not infringe on copyrights by choosing 
a radically new name (ESRA) for our language, rather than something like OPL+. 

3.3 Useful High-Level Type Constructors 

It is generally recognised that the highest-level data structures are: 

— sequences: element containers where the union operation is associative, with 
element order and element repetition being relevant; 

— bags (or multisets): sequences where the union operation is commutative, 
making element order irrelevant; 

— sets: bags where the union operation is idempotent, making element repetition 
irrelevant. 

Sequences, bags, and sets are of possibly unbounded cardinality. Their usage is 
recommended for the high-level modelling of problems. At lower levels of ab- 
straction, these data structures can be represented in a variety of ways, using 
trees, bit vectors, pointers, etc. As we are (here) not interested in sequential 
access to sequence elements nor in sequences of unbounded cardinality, we aban- 
don sequences in favour of (fixed-size) arrays, with direct access to elements. 
Similarly, we are (for the time being) not concerned with bags and infinite sets, 
and ignore them as modelling devices. 
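
The three algebraic properties can be observed with standard Python containers, taking lists for sequences, collections.Counter for bags, and set for sets. The mapping of the three notions onto these Python types, and the function names, are assumptions made for this illustration.

```python
from collections import Counter


def seq_union(a, b):
    """Sequence union as list concatenation: associative, order matters."""
    return a + b


def bag_union(a, b):
    """Bag union as Counter addition: commutative, multiplicities add up."""
    return a + b


def set_union(a, b):
    """Set union: commutative and idempotent."""
    return a | b
```

For instance, seq_union([1, 2], [3]) differs from seq_union([3], [1, 2]), bag_union of a one-element bag with itself counts that element twice, and set_union of a set with itself returns an equal set.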

So, equipped with (finite) sets and arrays, what can a problem modeller 
do? More precisely, are there any useful recurring modelling idioms that can be 
captured in new ways? Following D.R. Smith [12], we claim that many problems 
belong to one of the following four classes:³ 

— SUBSET: Find a subset of a given set. For example, finding a clique of a 
graph amounts to finding a subset of its vertex set. 

— MAPPING: Find a mapping from a given set to another given set. For 
example, the colouring of the countries of a map, such that any two neighbour 
countries have different colours, fits this class. 

— PERMUTATION: Find an array that represents a permutation of a given 
set. For example, scheduling jobs according to precedence constraints is a 
permutation problem. 

— SEQUENCE: Find an array that represents a sequence, of bounded cardi- 
nality, of elements drawn from a given set. For example, a (variant of the) 
travelling salesperson problem can be modelled this way, with a set of cities 
being ordered into a route, such that every city is visited at least once. 

Going beyond Smith’s classification now, we recognise that many real-life prob- 
lems are actually hybrid in nature, so that we also need to support any combi- 
nation of the four classes above. For instance, the Warehouse Location problem 
is a hybrid of SUBSET and MAPPING, because a mapping has to be found 
from the given set of stores into a subset of the given set of warehouses. 

The four classes above are thus actually not classes of stand-alone problems, 
but rather give rise to powerful high-level type constructors, of which several can 
be used in the same program. The syntax and (informal) meaning of their usage 
in variable declarations is as follows: 

³ Smith actually identifies seven classes, but we discarded one here, as it is not applicable 
to CSPs, and we twice merged two of his classes into one. 

— var {T} S: Set S is a subset of set T. A superset of T must be known (i.e., 
either T is a set or T is a subset of a known set). The internal representation 
of sets is hidden from the modeller, without loss of power. For instance: 

var {Vertices} Clique; 

forall(A,B in Clique) <A,B> in Edges; 

is the core of a model of the clique problem. 

— var V->W M: Mapping M is from set V into set W. Supersets of V and W must be 
known. The internal representation of mappings is hidden from the modeller, 
access being restricted as in an abstract datatype. For instance: 

var Countries->Colours M; 

forall(A,B in Countries) <A,B> in Neighbours => M[A]<>M[B]; 

is the core of a model of the map colouring problem. 

— var perm(S) A: Array A represents a permutation of set S. A superset of S 
must be known. For instance: 

var perm(Jobs) Sched; 

forall(I,J in 1..card(Sched)) <Sched[I],Sched[J]> in Prec => I<J; 

is the core of a model of the job scheduling problem. 

— var seq(S,K) A: Array A represents a sequence, of bounded cardinality K, 
of elements drawn from set S. A superset of S must be known. For instance: 

int MaxCities = ...; 

var seq(Cities,MaxCities) Route; 

forall(City in Cities) City in Route; 

is the core of a model of our travelling salesperson problem. 
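
What each of the four core models demands of a solution can be spelled out as a Python checker over a candidate value. This is only a sketch of the intended meaning, with invented function names and with sets, dicts, and lists standing in for the ESRA types; it says nothing about how solutions are found. One assumption deserves emphasis: the clique check skips pairs with A = B, so that self-loops are not required in Edges.

```python
def is_clique(clique, edges):
    """var {Vertices} Clique: all pairs of distinct members are edges."""
    return all((a, b) in edges for a in clique for b in clique if a != b)


def is_colouring(colour, neighbours):
    """var Countries->Colours M: neighbouring countries differ in colour."""
    return all(colour[a] != colour[b] for a, b in neighbours)


def is_schedule(sched, jobs, prec):
    """var perm(Jobs) Sched: every job exactly once, respecting Prec."""
    if sorted(sched) != sorted(jobs):
        return False
    pos = {job: k for k, job in enumerate(sched)}
    return all(pos[a] < pos[b] for a, b in prec)


def is_route(route, cities, max_cities):
    """var seq(Cities,MaxCities) Route: bounded length, every city occurs."""
    return len(route) <= max_cities and set(route) == set(cities)
```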

The SUBSET class can be usefully generalised to nSUBSETS, where the aim 
is to find a maximum of n subsets of the given set. For instance, the Warehouse 
Location problem can also be seen as finding, for each warehouse, the set of 
stores to which it delivers, i.e., finding card(Warehouses) subsets of Stores. 
(Note that these subsets must be disjoint and that their union must be Stores; 
however, this is not a partitioning problem, as some of the subsets may be empty, 
denoting the fact that some warehouses are not to be opened.) 
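
The nSUBSETS reading of the Warehouse Location problem can be illustrated by inverting a Supplier mapping into one subset of Stores per warehouse. The function name and the dict-based representation are assumptions made for this sketch.

```python
def stores_per_warehouse(supplier, warehouses):
    """Turn a store -> warehouse mapping into one store subset per warehouse.

    Warehouses that supply no store get the empty set, which corresponds
    to a warehouse that is not opened.
    """
    subsets = {w: set() for w in warehouses}
    for store, w in supplier.items():
        subsets[w].add(store)
    return subsets
```

Because every store has exactly one supplier, the resulting subsets are disjoint and their union is the whole set of stores, while some of them may be empty.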

4 The Semantics of ESRA 

We now explain the semantics of the ESRA language, by exhibiting the architecture 
of a compiler (into OPL), and showing that the main modules of that 
architecture can be easily implemented by a set of ESRA-to-OPL rewrite rules. 
Finally, we give another example of the power of our approach, by modifying 
the Warehouse Location problem, re-modelling it straightforwardly in ESRA, but 
obtaining a less intelligible and longer OPL program through compilation. 

[Figure 4: block diagram of the compiler; the only labels that survive extraction are "Composer" and "generated OPL program".] 



Fig. 4. Architecture of the ESRA compiler 



4.1 Architecture of Our ESRA Compiler 

The architecture of our ESRA compiler is shown in Figure 4. First, the Decom- 
poser separates an ESRA program into its declaration, optimisation, and con- 
straint parts. Next, the ESRA-to-OPL Declaration Converter rewrites all ESRA 
declarations into OPL declarations, and possibly into some OPL constraints and 
OPL display statements (see Section 4.2 for details). Also, the ESRA-to-OPL Con- 
verter rewrites the ESRA optimisation and constraint parts into an OPL opti- 
misation part and more OPL constraints, using the declaration part (again see 
Section 4.2 for details). Finally, the Composer assembles the generated OPL program 
by suitably concatenating the obtained OPL statements. 

The Decomposer and Composer modules are trivial, and are not discussed 
here. The converter modules are explained next. 



4.2 ESRA-to-OPL Rewrite Rules 

We use conditional rewrite rules, here written as follows: 

L => R | C 

meaning that, if condition C holds, then expression L is rewritten into R. 

As a running example, we show how each line of the ESRA model in Figure 2 
is compiled into some line(s) of the OPL model in Figure 3. For reasons of space, 
we here only exhibit the rules that are needed to make this paper self-sufficient. 
We refer to [6] for the complete set of rules. 



ESRA-to-OPL Declaration Converter. The declarations of ESRA that involve 
types not supported (in the same way) by OPL (namely sets, mappings, permuta- 
tions, and sequences) are rewritten into OPL declarations, and possibly into some 
OPL constraints and display statements. All other declarations literally become 
OPL declarations. For instance, lines a-f are identical to lines 1 to 6. 

For set variable declarations, one of the rules is as follows: 

   var {T} S;
   => var int S[T] in 0..1; display (I in T: S[I]=1) <I>;
   | T is a known set

A set S of known super-set T is thus represented, in this case, as an array of
Boolean variables, indexed by T. This Boolean representation of sets is more
memory-consuming than the set interval representation of CONJUNTO [8] and
OZ [11], but both have been shown to create the same $O(2^n)$ search space [8].
Moreover, the set interval representation does not allow the definition of some (to 
us) desirable high-level primitives, such as universal quantification over elements 
of non-ground sets. This is why we have resorted to the Boolean representation, 
which is only naive in appearance. A display statement is also generated, in order 
to pretty-print S. For instance, line 7 is rewritten into lines g and s because 
Warehouses is declared in line 3 as an enumeration. 
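
The Boolean representation can be illustrated outside OPL. The following Python
sketch is only an illustration under our own assumptions (the helper names and
the three-element super-set are hypothetical, not part of ESRA or OPL): it
enumerates every 0/1 array indexed by a known super-set T and decodes each one
the way the generated display statement would, which makes the $2^n$ candidate
sets explicit.

```python
from itertools import product


def decode(T, S):
    """Mirror of  display (I in T: S[I]=1) <I>  : the elements whose flag is 1."""
    return {i for i in T if S[i] == 1}


def all_assignments(T):
    """Every 0/1 array S indexed by T, i.e. the domain of  var int S[T] in 0..1."""
    T = list(T)
    for bits in product((0, 1), repeat=len(T)):
        yield dict(zip(T, bits))


# Hypothetical enumeration of candidate warehouse locations.
WAREHOUSES = ["Bonn", "Rome", "Paris"]
subsets = [decode(WAREHOUSES, S) for S in all_assignments(WAREHOUSES)]
print(len(subsets))  # one decoded set per 0/1 array
```

Each 0/1 array decodes to a different subset, so a super-set of three elements
yields eight candidate values for the set variable.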

For mapping variable declarations, one of the rules is as follows: 

   var V->W M;
   => var T M[V]; forall(I in V) W[M[I]]=1;
   | V is a known set, and W is a set variable of known super-set T

The mapping M is thus represented as an array of variables drawn from T, indexed
by V. Furthermore, we post the constraint that every actually mapped element
of T must be a member of W. For instance, line 8 is rewritten into lines h, m, n.

However, if a mapping is between set variables, then the rule is as follows:

   var V->W M;
   => var int M[S,T] in 0..1;
      forall(I in S, J in T) M[I,J]=1 => V[I]=1 & W[J]=1;
      forall(I in S) V[I]=1 => (sum(J in T) (M[I,J]=1))=1;
   | V and W are set variables of known super-sets S and T, respectively

The mapping M is thus represented as a two-dimensional array of Booleans,
indexed by S and T. Furthermore, we post the constraint that every actual pair
<I,J> of M forces I and J to be members of V and W, respectively. Finally, we post
the constraint that every element of V must be mapped to exactly one element of
W, because modelling the mapping as a Boolean matrix does not by itself enforce
this. This rule will be used in Section 4.3.

For each declaration involving n sets, there are $2^n$ rewrite rules, depending
on whether each set is itself a variable or not.
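
To see what the two constraints of the second mapping rule achieve together,
the following Python sketch (an illustration with hypothetical names and toy
data, not generated compiler output) fixes the Boolean arrays V and W and
enumerates all 0/1 matrices M that satisfy both constraints. The surviving
matrices are exactly the functions from the elements of V into the elements
of W.

```python
from itertools import product


def satisfying_matrices(S, T, V, W):
    """Yield every 0/1 matrix M over S x T that satisfies both generated
    constraints, for fixed Boolean arrays V (over S) and W (over T)."""
    cells = [(i, j) for i in S for j in T]
    for bits in product((0, 1), repeat=len(cells)):
        M = dict(zip(cells, bits))
        # forall(I in S, J in T) M[I,J]=1 => V[I]=1 & W[J]=1
        membership = all(
            M[i, j] == 0 or (V[i] == 1 and W[j] == 1) for i, j in cells
        )
        # forall(I in S) V[I]=1 => (sum(J in T) (M[I,J]=1)) = 1
        functional = all(
            V[i] == 0 or sum(M[i, j] for j in T) == 1 for i in S
        )
        if membership and functional:
            yield M


S, T = range(3), range(2)
V = {0: 1, 1: 0, 2: 1}  # V = {0, 2}, a subset of S
W = {0: 1, 1: 1}        # W = {0, 1}, a subset of T
solutions = list(satisfying_matrices(S, T, V, W))
print(len(solutions))
```

With two elements in V and two in W there are four such functions, and the row
of the element that is not in V is forced to be all zeros by the first
constraint.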



ESRA-to-OPL Converter. Expressions and constraints of ESRA that involve 
types not supported (in the same way) by OPL (namely sets, mappings, permu- 
tations, and sequences) are rewritten into OPL. The set operations of OPL (such 
as union, inter, in, subset) are thus now also allowed in constraints, rather 
than only in the pre-processing of instance data. Similarly for the expressions 
and constraints of ESRA that do not exist in OPL (such as card, count, exists). 
For sums over mappings, one of the rules is as follows: 

   sum(I->J in M) F(I,J)
   => sum(I in V) F(I,M[I])
   | V is a known set, W is a set variable of known super-set T,
     and M is a mapping from V into W

because, in this case, the mapping M is represented by an array of elements drawn 
from T, indexed by V. For instance, line 10 is rewritten into line j. 

For the cardinality of a set, one of the rules is as follows: 

   card(S)
   => sum(I in T) S[I]
   | S is a set variable of known super-set T

because, in this case, the set S is represented by a Boolean array, indexed by T. 
For instance, line 11 is rewritten into line k. 

For membership in a mapping, one of the rules is as follows: 

   I->J in M
   => M[I]=J
   | V is a known set, W is a set variable of known super-set T,
     and M is a mapping from V into W

because, in this case, the mapping M is represented by an array of elements drawn
from T, indexed by V, and the set W is represented by a Boolean array, indexed
by T. For instance, part of line 14 is rewritten into part of line q.

For count expressions over sets, one of the rules is as follows: 

   count(I in S: C) => sum(I in S) C | S is a known set

because, in this case, the set S is known. For instance, part of line 14 is rewritten 
into part of line q. 

For universal quantification over sets, one of the rules is as follows: 

   forall(I in S) P(I);
   => forall(I in T) S[I]=1 => P(I);
   | S is a set variable of known super-set T

because, in this case, the set S is represented by a Boolean array, indexed by T. 
For instance, line 13 is rewritten into lines o and p. 
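
The conditional nature of these rules can be sketched as follows. This Python
fragment is a simplified illustration, not the compiler itself: the symbol
table ENV, its two kinds of entries, and the function names are assumptions
made for the example. Each function checks the condition C of its rule against
the declarations and only then emits the right-hand side R as OPL text.

```python
# Hypothetical symbol table built from the declaration part:
# name -> (kind, known super-set or None).
ENV = {
    "Stores": ("known_set", None),
    "Warehouses": ("known_set", None),
    "OpenWarehouses": ("set_var", "Warehouses"),
}


def rewrite_card(set_name, env, index="I"):
    """card(S) => sum(I in T) S[I]  | S is a set variable of known super-set T"""
    kind, superset = env[set_name]
    if kind != "set_var":
        raise ValueError("rule condition not met: not a set variable")
    return f"sum({index} in {superset}) {set_name}[{index}]"


def rewrite_count(index, set_name, condition, env):
    """count(I in S: C) => sum(I in S) C  | S is a known set"""
    kind, _ = env[set_name]
    if kind != "known_set":
        raise ValueError("rule condition not met: not a known set")
    return f"sum({index} in {set_name}) {condition}"


def rewrite_forall(index, set_name, body, env):
    """forall(I in S) P(I); => forall(I in T) S[I]=1 => P(I);
    | S is a set variable of known super-set T"""
    kind, superset = env[set_name]
    if kind != "set_var":
        raise ValueError("rule condition not met: not a set variable")
    return f"forall({index} in {superset}) {set_name}[{index}]=1 => {body};"


print(rewrite_card("OpenWarehouses", ENV, index="J"))
print(rewrite_forall("J", "OpenWarehouses", "P(J)", ENV))
```

A full converter would select among the $2^n$ variants of each rule by
inspecting the kinds of all sets involved; the sketch covers one variant per
construct.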



4.3 Modifying the Warehouse Location Problem 

Suppose we modify the Warehouse Location problem as follows (with
modifications being highlighted in italics): A company considers opening warehouses on
some candidate locations in order to supply its existing stores, as well as possibly
closing some of these stores, but such that a certain minimum number of stores
remains open (C3). Each possible warehouse has the same maintenance cost, and
a capacity designating the maximum number of stores that it can supply (C1).
Each store that is not closed must be supplied by exactly one open warehouse
(C2). The supply cost to a store depends on the warehouse. The objective is to
determine which warehouses to open and which stores not to close, and which of
these warehouses should supply the various stores that are not closed, such that
the sum of the maintenance and supply costs is minimised.

In other words, we now look for a mapping of a subset of the stores into a 
subset of the warehouses. Figure 5 shows an ESRA model of this problem. 

The OPL model generated from that ESRA model is shown in Figure 6. Note
that the second rule for compiling mapping variables was used here. Comparing
it with the original OPL models in Figures 1 and 3, we observe not only
that new variable declarations and constraints were added, but also that some
existing variable declarations and constraints had to be modified, with the
overall code becoming quite complex. However, the higher level of abstraction of
ESRA allowed a very straightforward re-modelling, where only new variable
declarations and constraints were added, compared to the original ESRA model in
Figure 2, with the overall code still matching the informal problem specification
very closely. This became possible because an ESRA modeller need not worry
about how mappings and subsets are internally represented.

   int MaintCost = ...;
   int NbStores = ...;
   enum Warehouses ...;
   range Stores 0..NbStores-1;
   int Capacity[Warehouses] = ...;
   int SupplyCost[Stores,Warehouses] = ...;
   int MinNbStores = ...;
   var {Stores} RemainingStores;
   var {Warehouses} OpenWarehouses;
   var RemainingStores->OpenWarehouses Supplier;
   minimize
      sum(I->J in Supplier) SupplyCost[I,J]
      + card(OpenWarehouses) * MaintCost
   subject to {
      card(RemainingStores) >= MinNbStores;
      forall(J in OpenWarehouses)
         count(I in Stores: I->J in Supplier) <= Capacity[J];
   };



Fig. 5. An ESRA model of the modified Warehouse Location problem 
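
To make the semantics of this model concrete, the following Python sketch
solves a toy instance by exhaustive enumeration over the Boolean encoding
(arrays for RemainingStores and OpenWarehouses, a matrix for Supplier). It is
an illustration only: the instance data are invented, a real solver would not
enumerate, and we assume that "minimum number" means at least MinNbStores.

```python
from itertools import product

# Invented toy data (assumptions for illustration only).
MAINT_COST = 10
CAPACITY = [2, 1]                      # per warehouse
SUPPLY_COST = [[1, 5], [4, 2], [3, 3]]  # per store, per warehouse
MIN_NB_STORES = 2
STORES = range(len(SUPPLY_COST))
WAREHOUSES = range(len(CAPACITY))


def feasible(remaining, open_wh, supplier):
    """Check the constraints of the Boolean encoding on 0/1 arrays."""
    for i in STORES:
        for j in WAREHOUSES:
            # A supply pair links a remaining store to an open warehouse.
            if supplier[i][j] == 1 and not (remaining[i] == 1 and open_wh[j] == 1):
                return False
        # Each remaining store has exactly one supplier (C2).
        if remaining[i] == 1 and sum(supplier[i]) != 1:
            return False
    # At least MIN_NB_STORES stores remain open (C3, assumed non-strict).
    if sum(remaining) < MIN_NB_STORES:
        return False
    # Capacity of each open warehouse (C1).
    for j in WAREHOUSES:
        if open_wh[j] == 1 and sum(supplier[i][j] for i in STORES) > CAPACITY[j]:
            return False
    return True


def cost(open_wh, supplier):
    supply = sum(SUPPLY_COST[i][j] * supplier[i][j] for i in STORES for j in WAREHOUSES)
    return supply + sum(open_wh) * MAINT_COST


def solve():
    """Return (cost, remaining, open_wh, supplier) of a cheapest feasible plan."""
    best = None
    width = len(WAREHOUSES)
    for remaining in product((0, 1), repeat=len(STORES)):
        for open_wh in product((0, 1), repeat=width):
            for flat in product((0, 1), repeat=len(STORES) * width):
                supplier = [flat[i * width:(i + 1) * width] for i in STORES]
                if feasible(remaining, open_wh, supplier):
                    c = cost(open_wh, supplier)
                    if best is None or c < best[0]:
                        best = (c, remaining, open_wh, supplier)
    return best


print(solve())
```

On this toy data the cheapest plan keeps stores 0 and 2, opens only warehouse 0,
and costs 14.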



5 Conclusion 

Summary. Our contributions in this paper are (i) the proposal of high-level
type constructors for constraint programming languages, so that CSPs can be
modelled in more straightforward ways; (ii) the design of a practical set
constraint language, called ESRA, incorporating these ideas; and (iii) the
development of rewrite rules achieving a compilation from ESRA into OPL.

We have shown that the resulting OPL programs are often very similar to OPL 
models written by human modellers. The key issue is of course that it is easier to 
write the ESRA model, because ESRA offers higher-level abstractions than OPL. 
Interestingly, this result comes at no cost to solving efficiency, precisely because 
these abstractions can be automatically mapped into OPL statements that a 
human OPL modeller would (have to) write anyway. 

Being based on OPL, our ESRA language is computationally incomplete,
but this does not disturb us, as we just aim at speeding up the modelling (and
solving) of at least some classes of (real-life) CSPs. This philosophy is in line with
the current trend towards domain-specific tools, such as the PlanWare [2] system
for planning problems, or the primitives of OPL [16] for scheduling and resource
allocation problems.



Related Work. Several set constraint languages exist (such as CLPS [1],
CONJUNTO [8], NP-SPEC [3], OZ [11], and {log} [4]), but none of them seems as
expressive (or fast) as our proposal, for lack of the rich data and constraint
modelling mechanisms of OPL, and as none of them features all the additional
type constructors we advocate here. The first three languages are limited to
constraints on sets of initially known cardinality, and some of them do not support
optimisation problems.

   int MaintCost = ...;
   int NbStores = ...;
   enum Warehouses ...;
   range Stores 0..NbStores-1;
   int Capacity[Warehouses] = ...;
   int SupplyCost[Stores,Warehouses] = ...;
   int MinNbStores = ...;
   var int RemainingStores[Stores] in 0..1;
   var int OpenWarehouses[Warehouses] in 0..1;
   var int Supplier[Stores,Warehouses] in 0..1;
   minimize
      sum(I in Stores, J in Warehouses) SupplyCost[I,J] * Supplier[I,J]
      + (sum(J in Warehouses) OpenWarehouses[J]) * MaintCost
   subject to {
      forall(I in Stores, J in Warehouses)
         Supplier[I,J]=1 => RemainingStores[I]=1 & OpenWarehouses[J]=1;
      forall(I in Stores)
         RemainingStores[I]=1 => (sum(J in Warehouses) Supplier[I,J]) = 1;
      (sum(I in Stores) RemainingStores[I]) >= MinNbStores;
      forall(J in Warehouses)
         OpenWarehouses[J]=1 => (sum(I in Stores) (Supplier[I,J]=1)) <= Capacity[J];
   };
   display (I in Stores: RemainingStores[I]=1) <I>;
   display (I in Warehouses: OpenWarehouses[I]=1) <I>;
   display (I in Stores, J in Warehouses: Supplier[I,J]=1) <I,J>;

Fig. 6. Generated OPL model of the modified Warehouse Location problem

A language-independent computer-assisted constraint programming architec- 
ture was proposed [15], but it does not support set constraints. 

Taking a completely different approach, D.R. Smith developed the program
synthesisers KIDS [12,13], DesignWare [14], and PlanWare [2], which
semi-automatically compile high-level specifications written in REFINE into applicative
programs. When applied to specifications of CSPs, these systems excel (at
helping) in the generation of high-speed problem-specific solvers, which have been
deployed numerous times in industry, as they often outperform operations
research or constraint programming solvers. Significant amounts of theorem
proving and (computer-assisted) program optimisation are performed. These
synthesisers support possibly infinite sequences, bags, and sets, and thus allow more
expressive modelling than ESRA.

Our work is a result of adapting the fundamental ideas of KIDS and its
successors to the generation of constraint programs (see [5] for an early report).

Future Work. In order to fulfill our design intention of making ESRA also more
declarative than OPL, we are currently investigating the compile-time generation
of a (procedural) OPL search part by analysis of a (declarative) ESRA model
that has no such search part. First-generation constraint solvers were black boxes
and thus did not allow the formulation of model-specific labelling heuristics. The
current second-generation solvers, such as the one for OPL, provide an elaborate
notation for expressing such heuristics. As necessary as this may be, doing so
remains more of an art than a science. We thus see no reason not to quietly
prepare a third generation, where model-specific search parts in such a notation
are actually synthesised. See [7,10] for first results.

We also study the reformulation of ESRA models, by investigating the merits
of alternative OPL representations of high-level data structures, by considering
the integration of alternative models, and by examining the generation of implied
constraints. See [9] for first results.



Abstract. The problem of deciding timed bisimilarity has received
increasing attention; it is important for the verification of timed systems. Using
a characterization of timed bisimilarity in terms of models of constraint
databases, we present, to our knowledge, the first local, symbolic
algorithm for deciding timed bisimilarity; previous algorithms were based
on a finite, but prohibitively large, abstraction (the region graph or the
full backward stable graph). Our algorithm uses XSB-style tabling with
constraints. Our methodology is more general than those followed in
previous approaches in the sense that our algorithm can also be used to decide
whether two timed systems are alternating timed bisimilar.



1 Introduction 

The problem of deciding whether two timed systems are timed bisimilar has
received a lot of attention [Cer92, ACH94, LLW95, WL97]; the problem is
important because it arises in the course of checking equivalence between two
timed processes, a step often invoked in the verification procedure for real-time
systems. It has been argued [WL97] that timed bisimulation is the
suitable notion of bisimulation for real-time systems. Previous solutions to the
problem of deciding whether two timed systems are timed bisimilar
were based either on a finite, but prohibitively large, abstraction (the region
graph [Cer92, LLW95, ACH94]) or on computing “good points” on a full
backward stable graph [WL97]. Note that, unlike time-abstract bisimilarity, which
induces a finite number of equivalence classes (thanks to the finiteness of the
region graph), timed bisimilarity (equivalence) may generate an infinite number
of equivalence classes.

It is well known [McM93, HNSY94] that symbolic model checking algorithms
perform better in practice than enumerative ones. The success of on-the-fly
(local) symbolic algorithms [BLL+96] for model checking of timed systems has
been well documented in the literature. However, to the best of our knowledge,
no such local symbolic algorithm has been proposed for deciding timed
bisimilarity.

Using a new characterization of timed bisimilarity in terms of models of
constraint databases, we present, to our knowledge, the first local, symbolic and
practical algorithm for deciding (strong) timed bisimilarity between two timed
logic processes. To this end, we first identify a fragment of constraint query
languages over reals (in the sense of [KKR95, JM94, Rev90]) that allows us to model
real-time systems. We call the programs expressed in this fragment timed logic
processes (TLPs). We then establish a connection between timed logic processes and
the standard model of timed automata [AD94]. Precisely, we show that the TLP
“model” subsumes the timed automaton model, i.e., timed automata can be
translated to TLPs. We then define the notion of (strong) timed bisimilarity
between two timed logic processes. Using a product construction, we reduce the
problem of deciding whether two timed logic processes are timed bisimilar to the
membership problem for the model-theoretic semantics of constraint query
language programs. More specifically, given two TLPs $\mathcal{P}$ and $\mathcal{Q}$,
we use a product construction to generate a product constraint query language
program $T$ from $\mathcal{P}$ and $\mathcal{Q}$ such that $\mathcal{P}$ and
$\mathcal{Q}$ are timed bisimilar if and only if the initial atom is
not in the least model of $T$ (see Section 4). To obtain a local and symbolic
procedure for checking membership in the least model semantics of TLPs, we extend
the OLDT resolution of [TS86] with constraints. To guarantee the termination
of this procedure, we combine it with the trimming operation on constraints
described in [MP99]. This operation, which is orthogonal to the procedure, allows
us to avoid the computationally expensive operation of splitting constraints (in
contrast with [SS95]) while still guaranteeing termination when combined with
the procedure. If the two processes are found not to be timed bisimilar, our
algorithm provides diagnostics. We next define the notion of alternating timed
bisimilarity between two timed logic processes. This notion has not been
considered in the literature before, though [AHKV98] dealt
with the problem of alternating bisimilarity for the finite-state case. Using the
same methodology as described above, we obtain a local symbolic algorithm for
deciding alternating timed bisimilarity. Our methodology is robust in the sense
that its basic ingredients can be used to provide efficient solutions to a
wide range of verification problems for timed logic processes.

2 Timed Logic Processes 

We identify a fragment of constraint query languages over reals (in the sense
of [KKR95, Rev90, BS91]) that allows us to model real-time systems. We
refer to the programs expressed in this fragment as timed logic processes
(abbreviated TLPs). Before we define TLPs formally, we need the following notations
and definitions. Let the constraint $\gamma$ be defined by the grammar

$$\gamma ::= true \mid x_i > c \mid x_i < c \mid x_i \geq c \mid x_i \leq c \mid \gamma \wedge \gamma \qquad (1)$$

where $c \in \mathcal{N}$, the set of natural numbers, and the $x_i$ are variables.
All the results in this paper still hold if $\gamma$ consists of conjuncts of the
form $x_i - x_j \; \mathit{relop} \; d$ where $\mathit{relop} \in \{>, <, \leq, \geq\}$
and $d$ is an integer.

Definition 1 (t-clause). A t-clause is a clause of one of the following
four forms.

(1) $p(x) \leftarrow \varphi, p'(x')$

(2) $p(x) \leftarrow p_1(x), p_2(x)$

(3) $p(x) \leftarrow \gamma$

(4) $init \leftarrow p(x), x = 0$

where the constraints $\varphi$ are of the forms (here $n$ is the length of the
tuple $x = \langle x_1, \ldots, x_n \rangle$ of variables)

(1.1) $\varphi \equiv \gamma_1(x) \wedge \bigwedge_{i=1}^{n} x'_i = x_i + z \wedge z > 0 \wedge \gamma_2(x')$ (“time transitions”)

(1.2) $\varphi \equiv \gamma_1(x) \wedge \bigwedge_{i \in S} x'_i = 0 \wedge \bigwedge_{i \notin S} x'_i = x_i \wedge \gamma_2(x')$ (“edge transitions”)

where $S \subseteq \{1, \ldots, n\}$ and the constraints $\gamma$ are of the form
defined in the grammar (1).

We call the constraints $\gamma$ the guards of the clauses. In the sequel, we call
a clause of the form (1) an evolution clause if the constraint $\varphi$ is of the
form (1.1), and a system clause if the constraint $\varphi$ is of the form (1.2).
We also call clauses of the form (2) alternating clauses, clauses of the form (3)
(which are facts or generalized tuples) assertions, and clauses of the form (4)
initial clauses.

Definition 2 (TLP). An (unlabeled) TLP is a (finite) set of t-clauses in which 
at least one clause is an initial clause. 

We associate a logical formula with a TLP in the same way as
in [JM94]. The semantics of a TLP (over the domain of reals) is given in the
usual way [JM94] in terms of the logical formula associated with the TLP. Note
that the clauses in which the constraint $\varphi$ in the body is of the form (1.1)
contain the variable $z$ in the body. The variables $z$, which are existentially
quantified in the logical formula associated with a TLP, are called increment
variables.

A first motivation for TLPs is that this “model” subsumes the timed
automata [AD94] model; i.e., we can translate timed automata to timed logic
processes. These translations use only evolution clauses (clauses obtained by
translating time transitions), system clauses (clauses obtained by translating
edge transitions) and an initial clause (a clause specifying an initial position).
Of the other types of clauses, clauses of the form (2) (i.e., alternating clauses)
are used for expressing alternation (compare [DW99]). These clauses are useful
in modeling real-time scheduling problems in which multiple jobs are executed
simultaneously on multiple processors independently of one another. Clauses of
the form (3) (i.e., assertions) are used to rewrite an agent to a nil agent (in this
respect there is a similarity with process algebras; thus $p(x) \leftarrow \gamma$
states that the agent $p$ can rewrite to the nil agent if the values of the
variables $x$ satisfy the formula $\gamma$). These clauses can also be used to
express assertions about processes (e.g., by rewriting an agent to the nil agent
if the values of the variables $x$ violate a safety property). Thus the TLP
framework allows not only modeling a system, but also writing assertions about
the behaviors of the system.

3 Translation of Timed Automata into TLPs

Since the timed automaton model (see [AD94] for an introduction to this model) 
is predominantly used in the literature, we show the connection of the timed logic 
process model with the timed automaton model. In other words, we show that 
timed automata can be translated to TLPs. The construction of a TLP from a 
timed automaton is given below. 

Construction 1 Let

$$U = \langle AP, X_n, L, E, P, \ell^0, inv \rangle$$

be a timed automaton [AD94] with $n$ clocks, where $AP$ is a set of atomic
propositions, $X_n$ is a set of clocks ($n$ clocks $x_1, \ldots, x_n$), $L$ is a set of
locations, $E$ is a set of edges, $P$ is a labeling function that labels each location
with a set of atomic propositions, $\ell^0 \in L$ is the initial location and $inv$ is a
function that assigns to each location an invariant constraint. We translate $U$ to
a TLP $\mathcal{P}$ as follows. For each location $\ell \in L$, we introduce an $n$-ary
predicate $\ell(x)$. For each location $\ell \in L$, we have an evolution clause where
$\gamma_1$ and $\gamma_2$ are both the invariant of the location $\ell$ (i.e.,
$\gamma_2(x')$ is obtained from $\gamma_1(x)$ by renaming all variables in the tuple
$x$ to their primed versions in the tuple $x'$). Thus the evolution clause
takes the form

$$\ell(x) \leftarrow \ell(x') \wedge \varphi$$

where

$$\varphi \equiv inv_\ell(x) \wedge \bigwedge_{i=1}^{n} x'_i = x_i + z \wedge z > 0 \wedge inv_\ell(x')$$

($inv_\ell$ is the invariant of the location $\ell$). For each edge
$\langle \ell, \theta, Reset, \ell' \rangle \in E$ from $\ell$ to $\ell'$, where
$\theta$ is the guard of the edge and $Reset$ is the set of clocks reset on
that edge, we have a clause of the form (1.2) with head predicate $\ell(x)$ and body
predicate $\ell'(x')$, where $\gamma_1 = \theta \wedge inv_\ell(x)$,
$\gamma_2 = inv_{\ell'}$ and $S = Reset$ (here $inv_\ell$ and $inv_{\ell'}$ are
respectively the invariants of locations $\ell$ and $\ell'$). Thus the system
clause takes the form

$$\ell(x) \leftarrow \ell'(x') \wedge \varphi$$

where

$$\varphi \equiv \theta \wedge inv_\ell(x) \wedge \bigwedge_{i \in Reset} x'_i = 0 \wedge \bigwedge_{i \notin Reset} x'_i = x_i \wedge inv_{\ell'}(x').$$

We also add an initial clause $init \leftarrow \ell^0(x) \wedge x = 0$. The labeling
function $P$ is extended to the predicates in the canonical way.
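
The construction is mechanical, as the following Python sketch suggests. It is
a simplified illustration under our own assumptions: clauses are rendered as
plain strings, constraints are given as functions from variable names to text,
and all names (Edge, translate, the clock names x1, x2, ...) are hypothetical.
The reset and non-reset conjuncts are emitted in two groups, as in form (1.2).

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List

# A constraint is rendered over a given list of variable names.
Constraint = Callable[[List[str]], str]


@dataclass(frozen=True)
class Edge:
    source: str
    guard: Constraint
    resets: FrozenSet[int]  # 1-based indices of the clocks reset on the edge
    target: str


def conj(parts):
    return " & ".join(parts)


def translate(locations, edges, initial, inv: Dict[str, Constraint], n):
    """Return the evolution, system and initial clauses as strings."""
    xs = [f"x{i}" for i in range(1, n + 1)]
    xs_p = [f"{x}'" for x in xs]
    args, args_p = ", ".join(xs), ", ".join(xs_p)
    clauses = []
    # One evolution clause per location: all clocks advance by the same z.
    for loc in locations:
        delay = [f"{xp} = {x} + z" for x, xp in zip(xs, xs_p)]
        phi = conj([inv[loc](xs)] + delay + ["z > 0", inv[loc](xs_p)])
        clauses.append(f"{loc}({args}) <- {loc}({args_p}) & {phi}")
    # One system clause per edge: guard, source invariant, resets, target invariant.
    for e in edges:
        reset = [f"{xp} = 0" for i, xp in enumerate(xs_p, 1) if i in e.resets]
        keep = [f"{xp} = {x}"
                for i, (x, xp) in enumerate(zip(xs, xs_p), 1) if i not in e.resets]
        phi = conj([e.guard(xs), inv[e.source](xs)] + reset + keep
                   + [inv[e.target](xs_p)])
        clauses.append(f"{e.source}({args}) <- {e.target}({args_p}) & {phi}")
    # The initial clause starts all clocks at 0.
    clauses.append(f"init <- {initial}({args}) & " + conj(f"{x} = 0" for x in xs))
    return clauses


true = lambda xs: "true"
example = translate(
    locations=["l0", "l1"],
    edges=[Edge("l0", lambda xs: f"{xs[0]} > 1", frozenset(), "l1")],
    initial="l0",
    inv={"l0": true, "l1": true},
    n=1,
)
for clause in example:
    print(clause)
```

The example instance is a one-clock automaton with two locations, trivial
invariants and a single guarded edge without resets; its guard is chosen for
illustration.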

$\mathcal{R}$-base: For a constraint query language program $\mathcal{P}$ over reals,
with set of predicate symbols $Pred$ (assuming each predicate is $n$-ary), the
$\mathcal{R}$-base of $\mathcal{P}$ (where $\mathcal{R}$ is the set of non-negative
real numbers) is defined as $\{p(v) \mid p \in Pred, v \in \mathcal{R}^n\}$.

In the sequel, whenever we refer to resolution/resolvent, unless otherwise
mentioned, we will mean SLD/OLD resolution/resolvent [TS86].

Theorem 1 (Meaning of translation). For every timed automaton $U$, there
exists a TLP $\mathcal{P}$ (obtained by Construction 1) such that the transitions
of $U$ are exactly the resolution steps of $\mathcal{P}$ and the set of positions
[AD94] of $U$ is exactly the $\mathcal{R}$-base of $\mathcal{P}$.

4 Timed Bisimilarity 

In this section, we present a method for deciding timed bisimilarity for timed
logic processes. Since the definition of timed bisimilarity (see below) is inherently
non-alternating (i.e., assumes non-alternating timed systems), in this section we
assume non-alternating TLPs, that is, TLPs without alternating clauses. We also
assume that the TLPs do not involve assertion clauses (since the definition of
timed bisimilarity does not involve terminating TLPs). The alternating clauses
will be used in the definition of alternating timed bisimilarity (below). For the
sake of simplicity, we assume that in evolution clauses both guards $\gamma$ and
$\gamma'$ (over the unprimed and primed variables, respectively) are true
and in system clauses $\gamma'$ is true. The characterizations obtained below can be
trivially extended to general TLPs. Usually, for a TLP $\mathcal{P}$, we will denote
the initial predicate by $init_{\mathcal{P}}$. We assume that a TLP $\mathcal{P}$ is
equipped with a (finite) alphabet¹ $\Sigma$ and that each initial, system or
evolution clause is labeled by a letter (action) from this alphabet (a letter can
label several clauses). We let $a, b, c$ range over $\Sigma$. We refer to such TLPs
as labeled TLPs. The definition is formalized below.

Definition 3 (Labeled TLP). A labeled TLP with an alphabet $\Sigma$ is a TLP
consisting only of initial, system and evolution clauses in which every clause is
labeled by a letter from $\Sigma$.

In this section, whenever we refer to a TLP, we will actually mean a labeled
TLP. It can be easily shown that a timed automaton in which each edge is
labeled by a letter from an alphabet $\Sigma$ can be translated to a labeled TLP with
alphabet $\Sigma \cup \{0, \ldots, |L| - 1\} \cup \{init\}$ where $L$ is the set of
locations of the timed automaton. We will illustrate this with an example below.
We will refer to the resolvent of a ground atom and a clause as a ground resolvent.
Below we define the notion of timed bisimilarity. Roughly, two timed logic
processes are timed bisimilar if their “behaviors” are the same in “all” respects.

Definition 4 (Timed Bisimulation). Given two TLPs $\mathcal{P}$ and $\mathcal{Q}$
(assume that they have the same alphabet $\Sigma$) with $\mathcal{R}$-bases
$\mathcal{R}_{\mathcal{P}}$ and $\mathcal{R}_{\mathcal{Q}}$ respectively, a binary
relation $\sim_{tb} \subseteq \mathcal{R}_{\mathcal{P}} \times \mathcal{R}_{\mathcal{Q}}$
is a timed bisimulation if $(p(v), t(v')) \in \sim_{tb}$ implies the following:

- For each ground resolvent $p'(w) \in \mathcal{R}_{\mathcal{P}}$ of $p(v)$ through an
  evolution clause labeled 'a' with $\zeta$ as the value (assignment) of the
  increment variable in the resolution, there exists a ground resolvent
  $t'(w') \in \mathcal{R}_{\mathcal{Q}}$ of $t(v')$ through an evolution clause
  labeled 'a' of $\mathcal{Q}$ with $\zeta$ as the value of the increment variable,
  such that $(p'(w), t'(w')) \in \sim_{tb}$.

- For each ground resolvent $t'(w') \in \mathcal{R}_{\mathcal{Q}}$ of $t(v')$ through
  an evolution clause labeled 'a' with $\zeta$ as the value (assignment) of the
  increment variable in the resolution, there exists a ground resolvent
  $p'(w) \in \mathcal{R}_{\mathcal{P}}$ of $p(v)$ through an evolution clause
  labeled 'a' of $\mathcal{P}$ with $\zeta$ as the value of the increment variable,
  such that $(p'(w), t'(w')) \in \sim_{tb}$.

- For each ground resolvent $p'(w)$ of $p(v)$ through a system clause (or initial
  clause) labeled 'a' in $\mathcal{P}$, there exists a ground resolvent $t'(w')$ of
  $t(v')$ through a system clause labeled 'a' in $\mathcal{Q}$ such that
  $(p'(w), t'(w')) \in \sim_{tb}$.

- For each ground resolvent $t'(w')$ of $t(v')$ through a system clause (or initial
  clause) labeled 'a' in $\mathcal{Q}$, there exists a ground resolvent $p'(w)$ of
  $p(v)$ through a system clause labeled 'a' in $\mathcal{P}$ such that
  $(p'(w), t'(w')) \in \sim_{tb}$.

We say that $\mathcal{P}$ and $\mathcal{Q}$ are timed bisimilar iff there exists a
timed bisimulation $\sim_{tb} \subseteq \mathcal{R}_{\mathcal{P}} \times \mathcal{R}_{\mathcal{Q}}$
such that $(init_{\mathcal{P}}, init_{\mathcal{Q}}) \in \sim_{tb}$, where
$init_{\mathcal{P}}$ and $init_{\mathcal{Q}}$ are the initial (0-ary) predicates of
$\mathcal{P}$ and $\mathcal{Q}$ respectively.

¹ Note that we have not included “silent” actions here, but the results can be
extended to include silent actions.

In the sequel, for ground atoms $p$ and $q$, we write $p \sim_{tb} q$ instead of
$(p, q) \in \sim_{tb}$. It can be easily proved that the largest timed bisimulation
(it exists; see, e.g., [Mil89]) is an equivalence relation. We are actually
interested in the largest timed bisimulation. We now relate our definition of
timed bisimilarity to the notion of strong timed bisimilarity defined in [Cer92].

Theorem 2. Given two timed automata $U_1$ and $U_2$ (with each edge transition
labeled by a letter), $U_1$ and $U_2$ are strong timed bisimilar [Cer92] iff the
corresponding TLPs $\mathcal{P}_{U_1}$ and $\mathcal{P}_{U_2}$ as obtained by
Construction 1 are timed bisimilar.




We now illustrate how a timed automaton in which each edge transition is
labeled by a letter can be translated to a labeled TLP. The translation follows
Construction 1 with the labels assigned in the obvious way. Consider the two
timed automata shown in Figure 1. The first timed automaton has two
locations $l_0$ and $l_1$ and one clock $x$. The edge transition from $l_0$ to $l_1$
is labeled 'a'. The second timed automaton has two locations $m_0$ and $m_1$ and
one clock $y$. The edge transition from $m_0$ to $m_1$ is labeled 'a'. The alphabet
for each automaton is {a}. Consider the TLPs corresponding to the two timed
automata. The TLP corresponding to the first timed automaton has an initial
clause $init_1 \leftarrow l_0(x) \wedge x = 0$ (labeled by init), evolution clauses
$l_0(x) \leftarrow l_0(x') \wedge x' = x + z \wedge z > 0$ and
$l_1(x) \leftarrow l_1(x') \wedge x' = x + z \wedge z > 0$ (we
assume that the evolution clauses are labeled 0 and 1 respectively) and a system
clause $l_0(x) \leftarrow l_1(x') \wedge x > 1 \wedge x' = x$ (labeled 'a'). Notice
that we assign extra labels to the initial and evolution clauses. Thus the alphabet
for the TLP corresponding to the first timed automaton is {init, 'a', 0, 1} (the
timed automaton has two locations). Similarly, the TLP corresponding to the second
timed automaton can be constructed; its alphabet is also {init, 'a', 0, 1}. The
TLPs corresponding to the two timed automata are not timed bisimilar because the
(SLD) derivation $init_1 \rightarrow l_0(0) \rightarrow l_0(1.1) \rightarrow l_1(1.1) \ldots$
of the TLP corresponding to the first timed automaton cannot be matched by a
corresponding derivation of the TLP corresponding to the second timed automaton.



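$init \leftarrow \langle init_1, init_2 \rangle$

$\langle init_1, init_2 \rangle \leftarrow \langle l_0, m_0 \rangle(x, y) \wedge x = 0 \wedge y = 0$

$\langle l_0, m_0 \rangle(x, y) \leftarrow \langle l_0, m_0 \rangle(x', y') \wedge x' = x + z \wedge y' = y + z \wedge z > 0$

$\langle l_0, m_0 \rangle(x, y) \leftarrow \langle l_1, m_1 \rangle(x', y') \wedge x' = x \wedge y' = y \wedge x > 1 \wedge y < 1$

$\langle l_1, m_1 \rangle(x, y) \leftarrow \langle l_1, m_1 \rangle(x', y') \wedge x' = x + z \wedge y' = y + z \wedge z > 0$

$\langle l_0, m_0 \rangle(x, y) \leftarrow x > 1 \wedge y > 1$

$\langle l_0, m_0 \rangle(x, y) \leftarrow x < 1 \wedge y < 1$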



Fig. 2. The Product Program 



4.1 Construction of Product Program 

Given a TLP $\mathcal{P}$ in which every predicate is $m$-ary and a TLP $\mathcal{Q}$
in which every predicate is $n$-ary, with a common alphabet $\Sigma$, we construct a
product constraint query language program $T$ in which each predicate is
$(m+n)$-ary such that $\mathcal{P}$ and $\mathcal{Q}$ are timed bisimilar iff $init$
(see below) is not in the least model of $T$. Note that
the notion of timed bisimilarity can be described as a two-player game [ACH94].
The product program below essentially simulates this game. The construction is
as follows.

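Construction 2 First create predicates $\langle init_{\mathcal{P}}, init_{\mathcal{Q}} \rangle$
and $init$. Create the clause $init \leftarrow \langle init_{\mathcal{P}}, init_{\mathcal{Q}} \rangle$.
For each predicate $\langle p_{\mathcal{P}}, p_{\mathcal{Q}} \rangle$ created above,
expand it (i.e., create rule(s) defining that predicate) using the following rules
if the predicate has not already been expanded.

1. For each evolution clause in $\mathcal{P}$ labeled 'a' and having
   $p_{\mathcal{P}}$ in its head, do the following. Let $C_1, \ldots, C_k$ be all the
   evolution clauses in $\mathcal{Q}$ labeled 'a' that have $p_{\mathcal{Q}}$ as head.
   Create a clause of the form
   $\langle p_{\mathcal{P}}, p_{\mathcal{Q}} \rangle(x, y) \leftarrow \varphi \wedge \bigwedge_{i=1}^{k} (\langle p'_{\mathcal{P}}, p'_{\mathcal{Q}_i} \rangle(x', y') \wedge \varphi_i[x \mapsto y, x' \mapsto y'])$,
   where $x$ ($x'$) is an $m$-tuple of variables, $y$ ($y'$) is an $n$-tuple of
   variables, $p'_{\mathcal{P}}$ and $\varphi$ occur in the body of the clause of
   $\mathcal{P}$ under consideration, and $p'_{\mathcal{Q}_i}$ and $\varphi_i$ occur in
   the body of the clause $C_i$. Do the above for each $a$ in $\Sigma$.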

2. Do point 1 above with the roles of $\mathcal{P}$ and $\mathcal{Q}$ reversed.

3. For each system (or initial) clause C in $\mathcal{P}$ labeled 'a' and having $p_{\mathcal{P}}$ in
its head, do the following. Let $C_1, \ldots, C_k$ be all the system (or initial)
clauses in $\mathcal{Q}$ labeled 'a' that have $p_{\mathcal{Q}}$ as head. Create a clause of the form
$(p_{\mathcal{P}}, p_{\mathcal{Q}})(x, y) \leftarrow p_1(x, y) \wedge \ldots \wedge p_k(x, y)$. Also create clauses of the form
$p_i(x, y) \leftarrow (p'_{\mathcal{P}}, p'_{\mathcal{Q}_i})(x', y') \wedge \varphi \wedge \varphi_i[x \mapsto y, x' \mapsto y']$ where x (x') is an
m-tuple of variables, y (y') is an n-tuple of variables, $p'_{\mathcal{P}}$ and $\varphi$ occur in
the body of the clause of $\mathcal{P}$ under consideration, $p'_{\mathcal{Q}_i}$ and $\varphi_i$ occur in the
body of the clause $C_i$. Let $\gamma_1, \ldots, \gamma_k$ be the guards of the clauses $C_1, \ldots, C_k$
respectively and let $\gamma$ be the guard of the clause C. Create a fact of the form
$(p_{\mathcal{P}}, p_{\mathcal{Q}})(x, y) \leftarrow \gamma \wedge (\neg(\gamma_1 \vee \ldots \vee \gamma_k)[x \mapsto y])$. Do the above for each a in
$\Sigma$.

4. Do point 3 above with the roles of $\mathcal{P}$ and $\mathcal{Q}$ reversed.

5. Repeat the above steps until there is no predicate that is created by the 
above method (i.e., stands in the body of a clause created by the above 
method) but has not been expanded. 


It is easy to see that the construction of the product program terminates.
The resulting program is a constraint query language program in which every
predicate is (m + n)-ary. Note that the product program is a constraint query
language program and not a labeled TLP. The first and the second items in
Construction 2 try to detect when a "move" of one of the TLPs through an
evolution clause cannot be matched by a "move" of the other. The third and the
fourth items in Construction 2 do the same for initial and system clauses. We
first illustrate the product construction above through an example.

Example 1. Consider the TLPs corresponding to the timed automata shown in
Figure 1. As discussed before, the TLPs corresponding to the timed automata
are not timed bisimilar. The product program is given in Figure 2 where $init_1$
and $init_2$ are respectively the initial predicates in the translation to TLPs of the
first and second timed automata. The second clause corresponds to the third
item in Construction 2 for the label init. The third clause is due to the first item
in Construction 2 corresponding to the label 0. The fourth and sixth clauses are
due to the third item in Construction 2 corresponding to label 'a'. The seventh
clause is due to the fourth item in Construction 2 corresponding to label 'a'.
The fifth clause is due to the first item in Construction 2 corresponding to label
1. The clauses produced by the combinations of other items of Construction 2

² $x \mapsto y$ denotes renaming
with other labels are the same as those produced above. It can be easily seen
that there is a successful derivation from init in the product program (i.e., init
is in the least model of the program).

Theorem 3. Given TLPs $\mathcal{P}$ and $\mathcal{Q}$, $\mathcal{P}$ and $\mathcal{Q}$ are timed bisimilar if and only if
init is not in the least model of $\mathcal{T}$ (as created above).

Proof. The proof is given in the full version of the paper available from the
author's webpages.

It is to be noted that the product construction described above does not 
correspond to the weak product of timed graphs described in [WL97]. Also note 
that the product of two TLPs described above is not a TLP while the weak 
product of two timed graphs is a timed graph. Our product construction tries 
to find out when one of the TLPs fails to “simulate” a move of the other; this 
is indicated by a success through a fact. Thus a win of the adversary in the two 
player game of [ACH94] is simulated by a successful (SLD) derivation. 
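
The game reading above can be illustrated on a much simpler setting. The following Python sketch is only an untimed, finite-state analogue and not the construction of this paper: there are no clocks or constraints, and the dictionary encoding of transition systems and the names `adversary_wins` and `_unmatched` are assumptions made for illustration. It computes, as a least fixpoint, the pairs of states from which the adversary wins, which plays the role of init being derivable in the product program.

```python
def adversary_wins(lts_p, lts_q):
    """Return the set of pairs (s, t) from which the adversary wins.

    lts_p, lts_q: dict mapping each state to a set of (label, successor).
    A pair is winning if some move of one side cannot be answered by the
    other side without reaching a pair that is already winning.
    """
    win = set()
    changed = True
    while changed:  # iterate until the least fixpoint is reached
        changed = False
        for s in lts_p:
            for t in lts_q:
                if (s, t) in win:
                    continue
                if (_unmatched(lts_p[s], lts_q[t], win, False)
                        or _unmatched(lts_q[t], lts_p[s], win, True)):
                    win.add((s, t))
                    changed = True
    return win


def _unmatched(moves, replies, win, swapped):
    """True if some move has no equally labeled reply leading outside win."""
    for label, nxt in moves:
        answered = False
        for rlabel, rnxt in replies:
            # Pairs in win are always ordered (state of P, state of Q).
            pair = (rnxt, nxt) if swapped else (nxt, rnxt)
            if rlabel == label and pair not in win:
                answered = True
                break
        if not answered:
            return True
    return False
```

Under these assumptions, two finite systems are bisimilar exactly when the pair of their initial states is not in the returned set, mirroring the statement of Theorem 3.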

4.2 Implementation 

To prove that two TLPs $\mathcal{P}$ and $\mathcal{Q}$ are timed bisimilar, we try to prove that init is
not in the least model of the product constraint query language program $\mathcal{T}$. One
option is to compute the least model of $\mathcal{T}$ using the least fixpoint of the immediate
consequence operator for $\mathcal{T}$ [JM94]. This results in a (symbolic) global algorithm
for deciding timed bisimilarity. In our implementation, we follow an alternative
strategy. Following [MP99], we extend XSB-style tabling [CW96, TS86, Vie87]
with constraints to prove that init does not succeed in the top-down execution
using the non-ground transition system³. To be precise, our method is an
extension of OLDT resolution of [TS86]. The correctness is obtained by extending
standard results from logic programming: the non-ground state init (a non-ground
state is a pair consisting of a predicate and a constraint store) succeeds (i.e.,
has a ground instance which succeeds in the usual logic programming sense;
note that we can view a 0-ary predicate $p_0$ as a non-ground state $(p_0, true)$)

³ Given a constraint query language program, we define the non-ground transition
system it induces as follows. Let $(P, \varphi)$ be a non-ground goal (a non-ground goal
is a pair $(P, \varphi)$ where P is a conjunction of literals and $\varphi$ is a constraint). Let p(x)
be an atom in the conjunct P. Let $P' = P \setminus \{p\}$ be the conjunction of all predicates
in P other than p(x). Let C be a clause of the program whose head unifies with p(x).
Let B be the conjunction of literals in the body of C. Then the non-ground transition
system induced by the program is the transition system whose set of states is the set
of non-ground goals and the transition relation $\rightarrow_{ng}$ is defined by:

$(P, \varphi) \rightarrow_{ng} (Q, \varphi')$

where $Q = B \wedge P'$ and $\varphi' = \exists_{-(Variables(P'),\, Variables(B))}(\varphi \wedge (\exists_{-x}\psi) \wedge \theta)$ where
$\psi$ is the constraint in C, Variables(P') (Variables(B)) are the free variables in P'
(B) and $\theta$ is the mgu of p(x) and the head of C, where the existential quantifier
$\exists_{-x}$ is over all variables but x.
iff it succeeds in the derivation tree obtained by using tabled resolution (using
clauses in $\mathcal{T}$). To guarantee termination, we use a modified form of tabling:
to check whether a non-ground state $(pred(x), \varphi)$ succeeds in $\mathcal{T}$ we check
whether $(pred(x), trim(\varphi))$ succeeds in $\mathcal{T}$, where the operation trim is described
in [MP99]. Because of space limitations, we do not describe the full tabling
algorithm; it is described in the full version of the paper available from the
author's web pages. Below, we give a brief intuitive description of the essential
points of the algorithm. The algorithm maintains a table in which each entry,
indexed by a non-ground state, consists of a list of solutions (constraints). For
a non-ground goal (see footnote 3 for the definition) $(P, \varphi)$ with
p(x) being the leftmost atom, if there is a table entry corresponding to $(p(x), \psi)$
such that $\exists_{-x}\varphi \models \psi$ (where the existential quantifier is over all variables but
x), we mark this goal as a tabled goal and resolve it with the constraints (facts)
in the solution list of $(p(x), \psi)$. Otherwise, we mark this goal as a solution goal,
use OLD resolution, and insert a table entry corresponding to $(p(x), \exists_{-x}\varphi)$
with an empty solution list. Whenever a non-ground state (partially) succeeds,
we register the answers obtained in the table entry corresponding to that state.
Note that the tabling strategy produces a (symbolic) local algorithm. The
algorithm is local since the state space is explored in a demand-driven fashion:
we only explore the state space needed to prove that $(init_{\mathcal{P}}, init_{\mathcal{Q}}) \in\ \approx_{tb}$
(or not). To the best of our knowledge, this is the first (symbolic)
local algorithm for deciding timed bisimilarity for real-time systems. The
following theorem guarantees the termination of the algorithm. Note that
without the trim operation, the procedure described above is not guaranteed to
terminate.

Theorem 4. The local algorithm (combined with the trim operation) for decid- 
ing timed bisimilarity mentioned above terminates. 
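
To make the contrast between the global and the local strategy concrete, the following Python sketch works on propositional definite programs only. It is an illustrative simplification under stated assumptions, not the tabling algorithm of this paper: there are no constraints, no trim operation and no OLDT resolution, and the names `least_model`, `relevant_clauses` and `succeeds` are hypothetical. The global check iterates the immediate consequence operator over the whole program; the local check first restricts the program to the clauses the goal depends on.

```python
def least_model(clauses):
    """Least model of a propositional definite program.

    clauses: list of (head, [body atoms]); a fact has an empty body.
    Iterates the immediate consequence operator until nothing new appears.
    """
    model = set()
    while True:
        derived = {h for h, body in clauses if all(b in model for b in body)}
        if derived <= model:
            return model
        model |= derived


def relevant_clauses(clauses, goal):
    """Clauses reachable from goal through clause bodies (demand-driven)."""
    seen, todo = set(), [goal]
    while todo:
        atom = todo.pop()
        if atom in seen:
            continue
        seen.add(atom)
        for head, body in clauses:
            if head == atom:
                todo.extend(body)
    return [(h, body) for h, body in clauses if h in seen]


def succeeds(clauses, goal):
    """Local check: evaluate only the part of the program goal depends on."""
    return goal in least_model(relevant_clauses(clauses, goal))
```

Both checks agree on whether a goal such as init is in the least model; the local one simply never looks at clauses that the goal cannot reach.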



Note that even though the size of the product program can be in the
worst case exponential in the maximum arity of the predicates of the two TLPs,
we may not actually have to build the whole product program. Rather, we can
construct the product program on-the-fly along with the OLDT resolution
procedure described above. The details are straightforward. If the two TLPs
are found not to be timed bisimilar, a counterexample is provided using the
methods in [MP99].

We have implemented the algorithm described above using the CLP($\mathcal{R}$)
library of SICStus Prolog. We have used our implementation to check that the
TLPs corresponding to the two timed automata described in Figure 1 are not
timed bisimilar. The total time taken (including the product construction) on a
200 MHz Ultra Sparc is negligibly small. We are currently experimenting to see
the performance of our implementation on larger practical examples.




Constraint Database Models Characterizing Timed Bisimilarity 255 



5 Alternating Timed Bisimulation

In this section, we consider TLPs consisting of evolution, initial, system and
alternating clauses. In addition to the assumptions in the previous section, we
assume that each alternating clause is also labeled by a letter of the alphabet $\Sigma$.
We call such TLPs extended labeled TLPs. The definition is formalized below.

Definition 5 (Extended Labeled TLP). An extended labeled TLP with an
alphabet $\Sigma$ is a TLP consisting only of initial, system, evolution and alternation
clauses in which every clause is labeled by a letter from $\Sigma$.

Note that the class of extended labeled TLPs subsumes the class of labeled
TLPs defined in Section 4. In this section, whenever we refer to a TLP we will
actually mean an extended labeled TLP.

Definition 6 (Alternating Timed Bisimulation). Given two TLPs $\mathcal{P}$ and
$\mathcal{Q}$ (assume that they have the same alphabet $\Sigma$) with $\mathcal{R}$-bases $\mathcal{R}_{\mathcal{P}}$ and $\mathcal{R}_{\mathcal{Q}}$
respectively, a binary relation $\approx_{atb}\ \subseteq \mathcal{R}_{\mathcal{P}} \times \mathcal{R}_{\mathcal{Q}}$ is an alternating timed
bisimulation if it is a timed bisimulation and in addition $(p(v), t(v')) \in\ \approx_{atb}$
implies the following:

— For each ground resolvent P (a conjunction of ground atoms) of p(v) through
an alternating clause of $\mathcal{P}$ labeled 'a', there exists a ground resolvent T
of t(v') through an alternating clause labeled 'a' in $\mathcal{Q}$ such that if t'(w)
is a conjunct in T, then there exists a conjunct p'(u) in P such that
$(p'(u), t'(w)) \in\ \approx_{atb}$.

— For each ground resolvent T (a conjunction of ground atoms) of t(v') through
an alternating clause of $\mathcal{Q}$ labeled 'a', there exists a ground resolvent P
of p(v) through an alternating clause of $\mathcal{P}$ labeled 'a' such that if p'(u)
is a conjunct in P, then there exists a conjunct t'(w) in T such that
$(p'(u), t'(w)) \in\ \approx_{atb}$.

We say that $\mathcal{P}$ and $\mathcal{Q}$ are alternating timed bisimilar iff there exists an
alternating timed bisimulation $\approx_{atb}\ \subseteq \mathcal{R}_{\mathcal{P}} \times \mathcal{R}_{\mathcal{Q}}$ such that $(init_{\mathcal{P}}, init_{\mathcal{Q}}) \in\ \approx_{atb}$, where
$init_{\mathcal{P}}$ and $init_{\mathcal{Q}}$ are the initial (0-ary) predicates of $\mathcal{P}$ and $\mathcal{Q}$ respectively.

5.1 Product Program 

Given a TLP $\mathcal{P}$ in which each predicate is m-ary and a TLP $\mathcal{Q}$ in which each
predicate is n-ary, with a common alphabet $\Sigma$, we construct a product constraint
query language program $\mathcal{T}$ in which each predicate is (m + n)-ary such that $\mathcal{P}$ and
$\mathcal{Q}$ are alternating timed bisimilar iff init (see below) is not in the least model of
$\mathcal{T}$. The product construction extends that described for detecting timed
bisimilarity above. In particular, in addition to the first four items in the product
construction of Construction 2 described above, it also contains the following
two items.

— For each alternating clause $C = p_{\mathcal{P}}(x) \leftarrow p_{\mathcal{P}_1}(x) \wedge p_{\mathcal{P}_2}(x)$ in $\mathcal{P}$ labeled
'a' and having $p_{\mathcal{P}}$ in its head, do the following. Let $C_1, \ldots, C_k$ be all the
alternating clauses labeled 'a' in $\mathcal{Q}$ that have $p_{\mathcal{Q}}$ in their head. Let $C_i$
be of the form $p_{\mathcal{Q}}(y) \leftarrow p_{\mathcal{Q}_i^1}(y) \wedge p_{\mathcal{Q}_i^2}(y)$. Create a clause of the form
$(p_{\mathcal{P}}, p_{\mathcal{Q}})(x, y) \leftarrow p_1(x, y) \wedge \ldots \wedge p_k(x, y)$. Also for each $p_i$, create the
clauses Pi{x) < — Aj=i Afe=i ?/)■ fo'' ®^ch each j = 1..2 and 

k = 1..2 create the clauses q{^{x,y) < — {p-pj,pQ^^){x,y) and q{'^{x,y) < — 
{p'Pf.,P'p‘f){x,y). If there exists no alternating clause in Q labeled 'a' (i.e., 
if k = 0), add the fact $(p_{\mathcal{P}}, p_{\mathcal{Q}})(x, y) \leftarrow true$. Do the above for each 'a' in
$\Sigma$.

— Do the above step with the roles of $\mathcal{P}$ and $\mathcal{Q}$ interchanged.

The constructed program $\mathcal{T}$ is still a constraint query language program in
which each predicate is (m + n)-ary.

Theorem 5. Given TLPs $\mathcal{P}$ and $\mathcal{Q}$, $\mathcal{P}$ and $\mathcal{Q}$ are alternating timed bisimilar if
and only if init is not in the least model of $\mathcal{T}$ (as created above).

The implementation of the algorithm for deciding alternating timed
bisimilarity, using the methodology described above, remains the same as the one for
timed bisimilarity.



6 Related Work 

Timed bisimilarity has been considered in [Cer92, ACH94, LLW95, WL97].
The first three approaches are based directly on the region graph. Of these,
in [LLW95], the authors reduced the problem of deciding timed bisimilarity to
the problem of model checking for "characteristic formulae". Cerans [Cer92]
constructs the region graph for the product state space and then checks for
bisimilarity. None of these algorithms is local or symbolic. They essentially
require storing the whole region graph or its encoding (which is prohibitively
large) in memory. Compared to this, our algorithm is both local and symbolic.
Weise and Lenzes [WL97] gave an algorithm for deciding timed bisimilarity based
on detecting "good points" on a backward stable graph. Their methodology
involves construction of the weak product of the two given timed graphs and then
computing the full backward stable graph G of the weak product. Finally they
try to detect a subgraph of G where all the relevant nodes are "good" in a
certain sense. Their algorithm, strictly speaking, is not local. Also, we have not
found any easy way to extend their algorithm to deal with the problem of
deciding alternating timed bisimilarity. In contrast, our methodology reduces the
problem of deciding timed bisimilarity to that of checking membership in the
model-theoretic semantics of constraint query language programs. The latter
problem, as shown in Section 4.2, can be solved using a local symbolic algorithm. In
this context, we should note that the product construction described in
Section 4.1 is not related to the weak product construction presented in [WL97]. Also,
in contrast with [WL97], our methodology, as shown in Section 5, can easily deal
with the problem of deciding alternating timed bisimilarity.







In the finite state case, alternating bisimilarity has been considered
by [AHKV98]. Alternating timed bisimilarity has not been considered in the
literature before. Our definition of alternating timed bisimilarity can be seen as a
"timed" version of the alternating bisimilarity considered in [AHKV98].

Logic Programming/Database-Theoretic approaches to verification of both
finite and infinite state systems have been considered by several authors [RRR+97,
CDD+98, GP97, Gup99]. But none of them have addressed the problems that
we have considered in this paper. In [FP93] the authors provide a deductive
method for checking whether two goals are "equivalent" in a certain sense. But
their method does not provide any algorithm for deciding timed bisimilarity for
timed logic processes.
Abstract. We extend Temporal Annotated Constraint Logic Programming
(TACLP) in order to obtain a framework where both temporal and
spatial information can be dealt with and reasoned about. This results in 
a conceptually simple, uniform setting, called STACLP (Spatio-Temporal 
Annotated Constraint Logic Programming), where temporal and spatial 
data are represented by means of annotations that label atomic first- 
order formulae. The expressiveness and conciseness of the approach are 
illustrated by means of some examples: Definite, periodic and indefinite 
spatio-temporal information involving time-varying objects and proper- 
ties can be handled in a natural way. 

Keywords: Spatio-temporal reasoning, constraint logic programming, 
annotated logics. 



1 Introduction 

The handling of spatio-temporal data has recently begun to attract broader
interest [1,3,8,21]. The more computers support humans, the more they have to
be capable of dealing with spatio-temporal information. Time and space are
closely interconnected: Much information that is referenced to space is also
referenced to time and may vary with time. Prominent applications are geographic
information systems (GISs), environmental monitoring, and geometric modeling
systems (CAD). Other areas in need of spatio-temporal reasoning are cadastral
databases, diagrammatic reasoning and scientific applications.

In a logical formalization of spatial and temporal information it is quite nat- 
ural to think of formulae that are labelled by spatio-temporal information and 
about proof procedures that take into account these labels. In this perspective, 
the general framework of annotated constraint logic (ACL) [7] extends first or- 
der languages with a distinguished class of predicates, called constraints, and 
a distinguished class of terms forming a lattice, called annotations, which are 
used to label formulae. Semantically, ACL provides inference rules for reasoning 
on annotated formulae and a constraint theory defining lattice operations, i.e. 
constraints, on annotations. One advantage of a language in the ACL family 
is that its clausal fragment can be efficiently implemented [7]: Given a logic in 
this framework, there is a systematic way to make a clausal fragment executable
as a constraint logic program [13]. Based on the ACL framework, the family 
of temporal annotated constraint logic programming (TACLP) languages has 
been developed and investigated in [7,12,16]. TACLP supports qualitative and 
quantitative (metric) temporal reasoning involving both time points and time 
periods (time intervals) and their duration. Moreover, it allows one to represent 
definite, indefinite and periodic temporal information. 

Having ACL and TACLP, a natural further step is to consider the addition 
of spatial information. A first proposal can be found in [12]. In such an approach 
an object is modeled by a predicate and its spatial extent is represented by 
adding variables denoting the spatial coordinates as arguments and by placing 
constraints on such variables. Even if this kind of spatial representation is com- 
mon [1,4,8], it is somewhat awkward in the context of TACLP: While temporal 
information is represented by annotations, spatial information is encoded into 
the formulae. The result is a mismatch of conceptual levels and a loss of simplic- 
ity. Consequently, in this paper we present STACLP, a framework resulting from 
the introduction of spatial annotations in TACLP. The pieces of spatio-temporal 
information are given by couples of annotations which specify the spatial extent 
of an object or of a property at a certain time period. The use of annotations 
makes time and space explicit but avoids the proliferation of spatial and temporal
variables and quantifiers. Moreover, it supports both definite and indefinite
spatial and temporal information, and it allows one to establish a dependency
between space and time, thus making it possible to model continuously moving
points and regions.

While a lot of effort has been spent in developing extensions of logic programming
languages capable of managing time [14], the logic-based languages for the
handling of spatial information only deal with the qualitative representation of and
reasoning about space (see e.g. [17]). Moreover, the few attempts to manipulate
both time and space have led to languages for qualitative spatio-temporal
representation and reasoning [20]. On the other hand temporal [19,6] and spatial [9,15]
database technologies are relatively mature, although, also in the database area, 
their combination is far from straightforward [3]. In this context, the constraint 
database approach [10] appears to be very promising. 

Our spatio-temporal language is close to the approaches based on constraint 
databases [1,4,8]. From a database point of view, logic programs can represent 
deductive databases, i.e. relational databases enriched with intensional rules, 
constraint logic programs can represent constraint databases [10], and thus STA- 
CLP can represent spatio-temporal constraint databases. The spatio-temporal 
proposals in [1,8] are extensions of languages originally developed to express 
only spatial data. Thus the high-level mechanisms they offer are more oriented 
to query spatial data than temporal information. In fact, they can model only 
definite temporal information and there is no support for periodic, indefinite 
temporal data as we will stress in the comparison with [8] in Section 5.1. In
contrast, STACLP provides several facilities to reason on temporal data and
to establish spatio-temporal correlations. For instance, it allows one to describe
continuous change in time as well as [4] does, whereas both [1] and [8] can
represent only discrete changes. Also indefinite spatial and temporal information can
be expressed in STACLP, a feature supported only by the approach in [11].

Furthermore, STACLP takes advantage of the deductive power supplied by
constraint logic programming. For instance, recursive predicates can be exploited
to compute the transitive closure of relations, an ability not provided by the 
traditional approaches in the database field. In Section 5.4 we will illustrate how 
such an ability is relevant in network analysis, where it may be used to search 
for connections between objects. More generally, STACLP does not represent 
only data as in constraint databases but also rules. This extra feature makes 
the difference if one wants to use the language as a specification and/or analysis 
language. 

Overview of the paper. In Section 2, we briefly introduce the Temporal
Annotated Constraint Logic Programming framework (TACLP). In Section 3, we
present STACLP which extends TACLP by spatial annotations, and in Section 4 
we define its semantics using a meta-interpreter. In Section 5 we give several non- 
trivial examples aimed at illustrating the expressiveness of our approach. Finally, 
in Section 6, we draw some conclusions and possible directions of future work. 

2 TACLP: Time, Annotations, Constraints, Clauses 

In this section we briefly describe the syntax and semantics of Temporal An- 
notated Constraint Logic Programming (TACLP) [7], a natural and powerful 
framework for formalizing temporal information and reasoning. As mentioned 
in the introduction, TACLP is a constraint logic programming language where 
formulae can be annotated with temporal labels and where relations between 
these labels can be expressed by using constraints. In TACLP, the choice of the 
temporal ontology is free. In this paper, we will consider the instance of TACLP 
where time points are totally ordered and labels involve convex, non-empty sets 
of time points. Moreover only atomic formulae can be annotated and clauses are 
negation free. 



2.1 Time 

Time can be discrete or dense. Time points are totally ordered by the relation $\leq$.
We denote by T the set of time points and we assume a set of operations
(such as the binary operations $+$, $-$) to manage such points. We assume that
the time-line is left-bounded by the number 0 and open to the future, with the
symbol $\infty$ used to denote a time point that is later than any other. A time
period is an interval [r, s] with $r, s \in T$ and $0 \leq r \leq s \leq \infty$, which represents the
convex, non-empty set of time points $\{t \mid r \leq t \leq s\}$¹. Thus the interval $[0, \infty]$
denotes the whole time line.

¹ The results we present naturally extend to time lines that are bounded in other ways
and to time periods that are open on one or both sides.
2.2 Annotations and Annotated Formulae

An annotated formula is of the form $A\,\alpha$ where A is an atomic formula and $\alpha$
an annotation. There are three kinds of annotations based on time points and
on time periods. Let t be a time point and let J = [r, s] be a time period.

(at) The annotated formula A at t means that A holds at time point t.

(th) The annotated formula A th J means that A holds throughout, i.e., at every
time point in, the time period J. A th-annotated formula is defined in terms
of at as

$A\ \mathrm{th}\ J \Leftrightarrow \forall t \in J.\ A\ \mathrm{at}\ t.$

(in) The annotated formula A in J means that A holds at some time point(s) -
but we do not know exactly which - in the time period J. An in-annotated
formula is defined in terms of at as

$A\ \mathrm{in}\ J \Leftrightarrow \exists t \in J.\ A\ \mathrm{at}\ t.$

The in temporal annotation accounts for indefinite temporal information. 

The set of annotations is endowed with a partial order relation $\sqsubseteq$ which turns it
into a lattice. Given two annotations $\alpha$ and $\beta$, the intuition is that $\alpha \sqsubseteq \beta$ if $\alpha$ is
"less informative" than $\beta$ in the sense that for all formulae A, $A\,\beta \Rightarrow A\,\alpha$. More
precisely, being an instance of ACL, in addition to Modus Ponens, TACLP has
the two inference rules below:

$$\frac{A\,\alpha \quad \gamma \sqsubseteq \alpha}{A\,\gamma}\ \text{rule } (\sqsubseteq) \qquad\qquad \frac{A\,\alpha \quad A\,\beta \quad \gamma = \alpha \sqcup \beta}{A\,\gamma}\ \text{rule } (\sqcup)$$

The rule ($\sqsubseteq$) states that if a formula holds with some annotation, then it also
holds with all annotations that are smaller according to the lattice ordering.
The rule ($\sqcup$) says that if a formula holds with some annotation $\alpha$ and the same
formula holds with another annotation $\beta$ then it holds with the least upper
bound $\alpha \sqcup \beta$ of the two annotations.

2.3 Constraints 

We introduce the constraint theory for temporal annotations. Recall that a con- 
straint theory is a non-empty, consistent first order theory that axiomatizes the 
meaning of constraints. Besides an axiomatization of the total order relation < 
on the set of time points T, the constraint theory includes the following axioms 
defining the partial order on temporal annotations. 

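(at th)   $\mathrm{at}\ t = \mathrm{th}\ [t, t]$

(at in)   $\mathrm{at}\ t = \mathrm{in}\ [t, t]$

(th $\sqsubseteq$)   $\mathrm{th}\ [s_1, s_2] \sqsubseteq \mathrm{th}\ [r_1, r_2] \Leftrightarrow r_1 \leq s_1,\ s_1 \leq s_2,\ s_2 \leq r_2$

(in $\sqsubseteq$)   $\mathrm{in}\ [r_1, r_2] \sqsubseteq \mathrm{in}\ [s_1, s_2] \Leftrightarrow r_1 \leq s_1,\ s_1 \leq s_2,\ s_2 \leq r_2$

The first two axioms state that th I and in I are equivalent to at t when the
time period I consists of a single time point t². Next, if a formula holds at every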

² Especially in dense time, one may disallow singleton periods and drop the two axioms.
This restriction has no effects on the results we are presenting.




Spatio-temporal Annotated Constraint Logic Programming 263 



element of a time period, then it holds at every element in all sub-periods of that
period ((th $\sqsubseteq$) axiom). On the other hand, if a formula holds at some points of
a time period then it holds at some points in all periods that include this period
((in $\sqsubseteq$) axiom).

Next we axiomatize the least upper bound U of temporal annotations over 
time points and time periods. As explained in [7], it suffices to consider the 
least upper bound for time periods that produce another valid (non-empty) 
time period. Concretely, it is enough to compute the least upper bound of th 
annotations with overlapping time periods: 

(th $\sqcup$)   $\mathrm{th}\ [s_1, s_2] \sqcup \mathrm{th}\ [r_1, r_2] = \mathrm{th}\ [s_1, r_2] \Leftrightarrow s_1 < r_1,\ r_1 \leq s_2,\ s_2 < r_2$
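
As an informal check of how the axioms interact, the ordering and the least upper bound can be sketched in Python. This is only a sketch under stated assumptions: time points are plain numbers, the names `Ann`, `at`, `leq` and `lub_th` are hypothetical, and the comparison of an in annotation with a th annotation is not an axiom of its own but is derived here by transitivity through a singleton period.

```python
from typing import NamedTuple, Optional


class Ann(NamedTuple):
    kind: str   # "th" or "in"
    lo: float   # start of the period
    hi: float   # end of the period (lo <= hi is assumed)


def at(t: float) -> Ann:
    # (at th): at t = th [t, t]
    return Ann("th", t, t)


def leq(a: Ann, b: Ann) -> bool:
    """Return True if a is below b in the annotation ordering."""
    # (at in): in [t, t] = at t = th [t, t], so normalise singleton periods.
    if a.kind == "in" and a.lo == a.hi:
        a = at(a.lo)
    if b.kind == "in" and b.lo == b.hi:
        b = at(b.lo)
    if a.kind == "th" and b.kind == "th":
        return b.lo <= a.lo and a.hi <= b.hi      # (th order) axiom
    if a.kind == "in" and b.kind == "in":
        return a.lo <= b.lo and b.hi <= a.hi      # (in order) axiom
    if a.kind == "in" and b.kind == "th":
        # Derived: both periods must share some time point t, so that
        # in [..] <= in [t, t] = th [t, t] <= th [..].
        return max(a.lo, b.lo) <= min(a.hi, b.hi)
    return False  # th below in holds only for equal singletons, handled above


def lub_th(a: Ann, b: Ann) -> Optional[Ann]:
    """(th lub) axiom for overlapping periods where a starts first."""
    if a.kind == b.kind == "th" and a.lo < b.lo <= a.hi < b.hi:
        return Ann("th", a.lo, b.hi)
    return None  # case not covered by the axiom
```

For instance, with hours written as decimals, `leq(Ann("in", 9.5, 10.5), Ann("th", 9, 10))` holds because the two periods overlap, whereas `leq(Ann("th", 9.5, 10.5), Ann("th", 9, 10))` does not.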



2.4 Clauses 

The clausal fragment of TACLP, which can be used as an efficient temporal 
programming language, consists of clauses of the following form: 

$A\,\alpha \leftarrow C_1, \ldots, C_n, B_1\,\alpha_1, \ldots, B_m\,\alpha_m \quad (n, m \geq 0)$

where A is an atom (not a constraint), $\alpha$ and $\alpha_i$ are (optional) temporal
annotations, the $C_j$'s are constraints and the $B_i$'s are atomic formulae. Constraints
$C_j$ cannot be annotated.

We conclude the introduction of TACLP with an example taken from [16]. 

Example 1. In a company, there are managers and a secretary who has to manage 
their meetings. A manager is busy if he is in a meeting or if he is out. 

busy(P) th [T1, T2] ← in-meeting(P) th [T1, T2]
busy(P) th [T1, T2] ← out-of-office(P) th [T1, T2]

Suppose the schedule for today to be the following: Smith and Jones have a meet- 
ing at 9am and at 9:30am respectively, each lasting one hour. In the afternoon 
Smith goes out for lunch at 2pm and comes back at 3pm: 

in-meeting(smith) th [9am, 10am].    out-of-office(smith) th [2pm, 3pm].

in-meeting(jones) th [9:30am, 10:30am].

If the secretary wants to know whether Smith is busy between 9:30am and
10:30am she can ask for busy(smith) in [9:30am, 10:30am]. Since Smith is in a
meeting from 9am till 10am, one can indeed derive that Smith is busy. Notice
that this query exploits indefinite information: Since Smith is busy at least in
one instant of the period [9:30am, 10:30am], the secretary cannot schedule an
appointment for him for that period. Conversely, busy(smith) th [9:30am, 10:30am]
does not hold, because Smith is not busy between 10am and 10:30am.

The query (busy(smith) th [T1, T2], busy(jones) th [T1, T2]) reveals that both
managers are busy throughout the time period [9:30am, 10am], because this is 
the largest interval that is included in the time periods where both managers 
are busy. 
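
The three queries of this example can be replayed with a small Python sketch. It is not a TACLP interpreter: periods are closed intervals measured in minutes since midnight, the helper names (`hm`, `busy`, `busy_in`, `busy_th`, `common_busy`) are hypothetical, and the sketch does not merge adjacent periods with the (th ⊔) rule, which this schedule does not need.

```python
def hm(hours, minutes=0):
    """Clock time as minutes since midnight."""
    return 60 * hours + minutes


# th-annotated facts from the schedule, as (start, end) periods.
in_meeting = {
    "smith": [(hm(9), hm(10))],
    "jones": [(hm(9, 30), hm(10, 30))],
}
out_of_office = {"smith": [(hm(14), hm(15))]}


def busy(person):
    """Periods throughout which the person is busy (the two busy clauses)."""
    return in_meeting.get(person, []) + out_of_office.get(person, [])


def busy_in(person, lo, hi):
    """busy(person) in [lo, hi]: some busy period shares a point with it."""
    return any(max(lo, a) <= min(hi, b) for a, b in busy(person))


def busy_th(person, lo, hi):
    """busy(person) th [lo, hi]: one busy period covers the whole interval."""
    return any(a <= lo and hi <= b for a, b in busy(person))


def common_busy(p, q):
    """Largest periods throughout which both p and q are busy."""
    result = []
    for a, b in busy(p):
        for c, d in busy(q):
            lo, hi = max(a, c), min(b, d)
            if lo <= hi:
                result.append((lo, hi))
    return result
```

Here `busy_in("smith", hm(9, 30), hm(10, 30))` succeeds, `busy_th("smith", hm(9, 30), hm(10, 30))` fails, and `common_busy("smith", "jones")` yields the single period from 9:30am to 10am.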
3 STACLP: A Spatio-temporal Language

In this section we introduce an extension to TACLP where both temporal and 
spatial information can be dealt with and reasoned about. The resulting framework
is called Spatio-Temporal Annotated Constraint Logic Programming (STACLP).
It is worth noticing that spatial data can already be modeled in TACLP
by using constraints in the style of constraint databases (see [12]). However, this
spatial representation is somewhat awkward in the context of TACLP: While 
temporal information is represented by annotations, spatial information is en- 
coded into the formulae. The result is a mismatch of conceptual levels and a 
loss of simplicity. In STACLP we overcome this mismatch, by defining a uniform 
setting where spatial information is represented by means of annotations, so that 
the advantages of using annotations apply to the spatial dimension as well. 

3.1 Spatial Annotations and Constraints 

We consider as spatial regions rectangles represented as $[(x_1, x_2), (y_1, y_2)]$, where
$(x_1, y_1)$ and $(x_2, y_2)$ represent the lower-left and upper-right vertex of the
rectangle, respectively. More precisely, $[(x_1, x_2), (y_1, y_2)]$ is intended to model the
region $\{(x, y) \mid x_1 \leq x \leq x_2,\ y_1 \leq y \leq y_2\}$³. Rectangles are the two-dimensional
counterpart of convex sets of time points.

Furthermore we define three spatial annotations, which resemble the three 
temporal annotations, namely atp (at point/position), thr (all throughout re- 
gion), inr (somewhere in region). The set of spatial annotations is endowed with 
a partial order relation described by the following constraint theory. 



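(atp thr)   $\mathrm{atp}\ (x, y) = \mathrm{thr}\ [(x, x), (y, y)]$

(atp inr)   $\mathrm{atp}\ (x, y) = \mathrm{inr}\ [(x, x), (y, y)]$

(thr $\sqsubseteq$)   $\mathrm{thr}\ [(x_1, x_2), (y_1, y_2)] \sqsubseteq \mathrm{thr}\ [(x'_1, x'_2), (y'_1, y'_2)] \Leftrightarrow$
          $x'_1 \leq x_1,\ x_1 \leq x_2,\ x_2 \leq x'_2,\ y'_1 \leq y_1,\ y_1 \leq y_2,\ y_2 \leq y'_2$

(inr $\sqsubseteq$)   $\mathrm{inr}\ [(x'_1, x'_2), (y'_1, y'_2)] \sqsubseteq \mathrm{inr}\ [(x_1, x_2), (y_1, y_2)] \Leftrightarrow$
          $x'_1 \leq x_1,\ x_1 \leq x_2,\ x_2 \leq x'_2,\ y'_1 \leq y_1,\ y_1 \leq y_2,\ y_2 \leq y'_2$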
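
The spatial ordering reduces to rectangle containment, as the following Python sketch suggests. It is an illustration under assumptions: rectangles are encoded as `((x1, x2), (y1, y2))` tuples of numbers, annotations as `(kind, rectangle)` pairs, and the function names are hypothetical. Only comparisons between annotations of the same kind are covered.

```python
def contains(outer, inner):
    """True if rectangle inner lies within rectangle outer.

    A rectangle is ((x1, x2), (y1, y2)) with x1 <= x2 and y1 <= y2.
    """
    (ox1, ox2), (oy1, oy2) = outer
    (ix1, ix2), (iy1, iy2) = inner
    return ox1 <= ix1 and ix2 <= ox2 and oy1 <= iy1 and iy2 <= oy2


def atp(x, y):
    # (atp thr): atp (x, y) = thr [(x, x), (y, y)]
    return ("thr", ((x, x), (y, y)))


def leq_spatial(a, b):
    """a below b for two annotations of the same kind ("thr" or "inr")."""
    (kind_a, rect_a), (kind_b, rect_b) = a, b
    if kind_a == kind_b == "thr":
        return contains(rect_b, rect_a)   # (thr order): smaller region is weaker
    if kind_a == kind_b == "inr":
        return contains(rect_a, rect_b)   # (inr order): larger region is weaker
    raise ValueError("mixed thr/inr comparison is not covered by this sketch")
```

As in the temporal case, a property that holds throughout a region also holds throughout every sub-rectangle, while a property that holds somewhere in a region also holds somewhere in every enclosing rectangle.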



3.2 Combining Spatial and Temporal Annotations 

In order to obtain spatio-temporal annotations the spatial and temporal anno- 
tations are combined by considering couples of annotations as a new class of 
annotations. Let us first introduce the general idea of coupling of annotations. 

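Definition 1. Let $(A, \sqsubseteq_A)$ and $(B, \sqsubseteq_B)$ be two disjoint classes of annotations
with their partial order. The coupling is the class of annotations $(A * B, \sqsubseteq_{A*B})$
defined as follows

$A * B = \{\alpha\beta,\ \beta\alpha \mid \alpha \in A,\ \beta \in B\}$

$\gamma_1 \sqsubseteq_{A*B} \gamma_2 \Leftrightarrow ((\gamma_1 = \alpha_1\beta_1 \wedge \gamma_2 = \alpha_2\beta_2) \vee (\gamma_1 = \beta_1\alpha_1 \wedge \gamma_2 = \beta_2\alpha_2)) \wedge (\alpha_1 \sqsubseteq_A \alpha_2 \wedge \beta_1 \sqsubseteq_B \beta_2)$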

³ The approach can be easily extended to an arbitrary number of dimensions.
In our case the spatio-temporal annotations are obtained by considering the
coupling of spatial and temporal annotations. 

Definition 2 (Spatio-temporal annotations). The class of spatio-temporal 
annotations is the coupling of the spatial annotations Spat built from atp, thr 
and inr and of the temporal annotations Temp, built from at, th and in, i.e. 
Spat*Temp. 

To clarify the meaning of our spatio-temporal annotations, we present some
examples of their formal definition in terms of at and atp. Let t be a time
point, $J = [t_1,t_2]$ be a time period, $s = (x,y)$ be a spatial point and
$R = [(x_1,x_2),(y_1,y_2)]$ be a rectangle.

— The equivalent annotated formulae A atp s at t and A at t atp s mean that A
holds at time point t in the spatial point s.

— The annotated formula A thr R th J means that A holds throughout the time
period J and at every spatial point in R. The definition of such a formula
in terms of atp and at is:

$A\ \mathsf{thr}\ R\ \mathsf{th}\ J \Leftrightarrow \forall t \in J.\ \forall s \in R.\ A\ \mathsf{atp}\ s\ \mathsf{at}\ t.$

The formula A th J thr R is equivalent to the formula above because one can
be obtained from the other just by swapping the two universal quantifiers.

— The annotated formula A thr R in J means that there exist(s) some time
point(s) in the time period J in which A holds throughout the region R.
The definition of such a formula in terms of atp and at is:

$A\ \mathsf{thr}\ R\ \mathsf{in}\ J \Leftrightarrow \exists t \in J.\ \forall s \in R.\ A\ \mathsf{atp}\ s\ \mathsf{at}\ t.$

In this case swapping the annotations swaps the universal and existential
quantifiers and hence results in a different annotated formula A in J thr R,
meaning that for every spatial point in the region R, A holds at some time
point(s) in J.

Let us point out how two formulae which are obtained one from the other 
by exchanging the order of annotations might differ. Consider, for instance, the 
following annotated formulae. 

water thr R in [apr, jul].        water in [apr, jul] thr R.

The first one expresses that there is a time period between April and July in 
which the whole region R is completely covered by the water. On the other hand 
the second formula states that from April to July each point in the region R will 
be covered by the water, but different points can be covered in different time 
instants. Hence there is no certainty that in a time instant the whole region is 
covered by the water. 
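The difference between the two orders of annotation is just a swap of quantifiers, which can be checked on a small finite model. The sketch below is an illustration only: it assumes that regions and periods are discretised into finite lists of points, and the names `thr_in`, `in_thr` and the made-up `water` data are hypothetical.

```python
def thr_in(holds, region, period):
    """A thr R in J: there is a time t in J at which A holds at every point of R."""
    return any(all(holds(s, t) for s in region) for t in period)


def in_thr(holds, region, period):
    """A in J thr R: every point of R has some time in J at which A holds."""
    return all(any(holds(s, t) for t in period) for s in region)


# Made-up data: two points of R, flooded in different months (4 = April, 5 = May).
region = [(0, 0), (1, 0)]
period = [4, 5]
flooded = {((0, 0), 4), ((1, 0), 5)}


def water(s, t):
    return (s, t) in flooded
```

With this data `in_thr(water, region, period)` is true while `thr_in(water, region, period)` is false: each point is covered at some time, but at no single time is the whole region covered.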

Consider now inr R th I and th I inr R. The annotation A inr R th I means
that throughout the time interval I, A holds somewhere in the region R, e.g.
A may change its position and move inside R during the time interval I. On
the other hand, A th I inr R means that there exists a point in the region R for
which A holds throughout the interval I, so A is fixed during I. For instance
the annotated atom does(john, ski) inr R th [12am, 4pm] means that John is
skiing (thus possibly moving) inside the area R from 12am to 4pm, whereas
exhibition(monet) th [oct, dec] inr R means that in some fixed place in the
region R there is the Monet exhibition from October to December.

3.3 Least Upper Bound and Its Constraint Theory 

As for TACLP, besides Modus Ponens, two inference rules ($\sqsubseteq$) and ($\sqcup$) are
provided. Also for spatio-temporal annotations, we restrict the latter rule only
to least upper bounds that produce valid, new annotations, i.e., the resulting
regions are rectangles and the temporal components are time periods. Thus we
consider the least upper bound in the following cases:

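(1) $\mathsf{thr}\,[(x_1,x_2),(y_1,y_2)]\ \mathsf{th}\,[t_1,t_2] \sqcup \mathsf{thr}\,[(x_1,x_2),(z_1,z_2)]\ \mathsf{th}\,[t_1,t_2] = \mathsf{thr}\,[(x_1,x_2),(y_1,z_2)]\ \mathsf{th}\,[t_1,t_2] \Leftrightarrow y_1 < z_1,\ z_1 \le y_2,\ y_2 < z_2$

(1') axiom obtained by swapping the annotations in (1).

(2) $\mathsf{thr}\,[(x_1,x_2),(y_1,y_2)]\ \mathsf{th}\,[t_1,t_2] \sqcup \mathsf{thr}\,[(z_1,z_2),(y_1,y_2)]\ \mathsf{th}\,[t_1,t_2] = \mathsf{thr}\,[(x_1,z_2),(y_1,y_2)]\ \mathsf{th}\,[t_1,t_2] \Leftrightarrow x_1 < z_1,\ z_1 \le x_2,\ x_2 < z_2$

(2') axiom obtained by swapping the annotations in (2).

(3) $\mathsf{thr}\,[(x_1,x_2),(y_1,y_2)]\ \mathsf{th}\,[s_1,s_2] \sqcup \mathsf{thr}\,[(x_1,x_2),(y_1,y_2)]\ \mathsf{th}\,[r_1,r_2] = \mathsf{thr}\,[(x_1,x_2),(y_1,y_2)]\ \mathsf{th}\,[s_1,r_2] \Leftrightarrow s_1 < r_1,\ r_1 \le s_2,\ s_2 < r_2$

(3') axiom obtained by swapping the annotations in (3).

(4) $\mathsf{inr}\,[(x_1,x_2),(y_1,y_2)]\ \mathsf{th}\,[s_1,s_2] \sqcup \mathsf{inr}\,[(x_1,x_2),(y_1,y_2)]\ \mathsf{th}\,[r_1,r_2] = \mathsf{inr}\,[(x_1,x_2),(y_1,y_2)]\ \mathsf{th}\,[s_1,r_2] \Leftrightarrow s_1 < r_1,\ r_1 \le s_2,\ s_2 < r_2$

(5) $\mathsf{in}\,[t_1,t_2]\ \mathsf{thr}\,[(x_1,x_2),(y_1,y_2)] \sqcup \mathsf{in}\,[t_1,t_2]\ \mathsf{thr}\,[(x_1,x_2),(z_1,z_2)] = \mathsf{in}\,[t_1,t_2]\ \mathsf{thr}\,[(x_1,x_2),(y_1,z_2)] \Leftrightarrow y_1 < z_1,\ z_1 \le y_2,\ y_2 < z_2$

(6) $\mathsf{in}\,[t_1,t_2]\ \mathsf{thr}\,[(x_1,x_2),(y_1,y_2)] \sqcup \mathsf{in}\,[t_1,t_2]\ \mathsf{thr}\,[(z_1,z_2),(y_1,y_2)] = \mathsf{in}\,[t_1,t_2]\ \mathsf{thr}\,[(x_1,z_2),(y_1,y_2)] \Leftrightarrow x_1 < z_1,\ z_1 \le x_2,\ x_2 < z_2$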
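The spatial part of axioms (1) and (2) can be sketched as a partial function on rectangles. This is an illustration only, under two assumptions made here: rectangles are encoded as tuples `((x1, x2), (y1, y2))`, and the side conditions are read as "first bound strictly smaller, overlap or adjacency in the middle, last bound strictly smaller", so that the result is a new rectangle. The name `lub_thr` is hypothetical.

```python
def lub_thr(r, q):
    """Least upper bound of thr r and thr q when their union is again a rectangle, else None."""
    (x1, x2), (y1, y2) = r
    (a1, a2), (b1, b2) = q
    # Axiom (1): same x-extent, q extends r along the y axis and overlaps or touches it.
    if (x1, x2) == (a1, a2) and y1 < b1 <= y2 < b2:
        return ((x1, x2), (y1, b2))
    # Axiom (2): same y-extent, q extends r along the x axis and overlaps or touches it.
    if (y1, y2) == (b1, b2) and x1 < a1 <= x2 < a2:
        return ((x1, a2), (y1, y2))
    # Any other combination would not yield a new rectangle under these axioms.
    return None
```

Returning `None` mirrors the restriction to least upper bounds that produce valid, new annotations: two rectangles that only share a corner, for instance, are not merged.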

Axioms (1), (1'), (2) and (2') allow one to enlarge the region in which a property
holds in a certain interval. If a property A holds both throughout a region $R_1$
and throughout a region $R_2$ in every point of the time period I, then it holds
throughout the region which is the union of $R_1$ and $R_2$, throughout I. Notice
that the constraints on the spatial variables ensure that the resulting region
is still a rectangle. Axioms (3) and (3') concern the temporal dimension: if a
property A holds throughout a region R and in every point of the time periods
$I_1$ and $I_2$, then A holds throughout the region R in the time period which is the
union of $I_1$ and $I_2$, provided that $I_1$ and $I_2$ are overlapping. By using axiom (4)
we can prove that if a property A holds in some point(s) of region R throughout
the time periods $I_1$ and $I_2$, then A holds in some point(s) of region R throughout
the union of $I_1$ and $I_2$, provided that such intervals are overlapping. Finally, the
last two axioms allow one to enlarge the region R in which a property holds in the
presence of an in temporal annotation.

3.4 Clauses

The clausal fragment of STACLP differs from that of TACLP only in that
atoms can now be labelled with spatial and/or temporal annotations.

Definition 3. A STACLP clause is of the form:

$A\,\alpha\beta \leftarrow C_1, \ldots, C_n, B_1\,\alpha_1\beta_1, \ldots, B_m\,\alpha_m\beta_m \quad (n, m \ge 0)$

where A is an atom (not a constraint), $\alpha$, $\alpha_i$, $\beta$, $\beta_i$ are (optional) temporal and
spatial annotations, the $C_j$'s are constraints and the $B_i$'s are atomic formulae.
Constraints $C_j$ cannot be annotated.

A STACLP program is a finite set of STACLP clauses. 

In our setting, a complex region can be represented (possibly in an approximated
way) as a union of rectangles. A region idreg, divided into n rectangles
$\{[(x_1^i,x_2^i),(y_1^i,y_2^i)] \mid i = 1,\ldots,n\}$, is modeled by a collection of unit clauses as
follows:

resort(idreg) thr $[(x_1^1,x_2^1),(y_1^1,y_2^1)]$.   ...   resort(idreg) thr $[(x_1^n,x_2^n),(y_1^n,y_2^n)]$.

4 Semantics of STACLP 

In the definition of the semantics, without loss of generality, we assume all atoms
to be annotated with th, in, thr or inr labels. In fact, at t and atp (x, y) annotations
can be replaced with th [t, t] and thr [(x, x), (y, y)] respectively by
exploiting the (at th) and (atp thr) axioms. Moreover, each atom in the object
level program which is not two-annotated, i.e., which is labelled by at most one
kind of annotation, is intended to be true throughout the whole lacking dimension(s).
For instance an atom A thr R is transformed into the two-annotated
atom A thr R th [0, $\infty$]. Constraints remain unchanged.

The meta-interpreter for STACLP is defined by the following clauses: 

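$\mathit{demo}(\mathit{empty}).$   (1)

$\mathit{demo}((B_1, B_2)) \leftarrow \mathit{demo}(B_1),\ \mathit{demo}(B_2)$   (2)

$\mathit{demo}(A\,\alpha\beta) \leftarrow \alpha \sqsubseteq \delta,\ \beta \sqsubseteq \gamma,\ \mathit{clause}(A\,\delta\gamma, B),\ \mathit{demo}(B)$   (3)

$\mathit{demo}(A\,\alpha'\beta') \leftarrow \alpha_1\beta_1 \sqcup \alpha_2\beta_2 = \alpha\beta,\ \alpha' \sqsubseteq \alpha,\ \beta' \sqsubseteq \beta,\ \mathit{clause}(A\,\alpha_1\beta_1, B),\ \mathit{demo}(B),\ \mathit{demo}(A\,\alpha_2\beta_2)$   (4)

$\mathit{demo}(C) \leftarrow \mathit{constraint}(C),\ C$   (5)

A clause $A\,\alpha\beta \leftarrow B$ of a STACLP program is represented at the meta-level by

$\mathit{clause}(A\,\alpha\beta, B) \leftarrow \mathit{valid}(\alpha),\ \mathit{valid}(\beta).$   (6)

where valid is a predicate that checks whether the interval or the region in the
annotation is not empty.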

The first two clauses are the ordinary ones to solve the empty goal and
a conjunction of goals. The resolution rule (clause (3)) implements both the
Modus Ponens rule and the rule ($\sqsubseteq$). It contains two relational constraints on
annotations, which are processed by the constraint solver using the constraint
theory for temporal and spatial annotations presented in Sections 2.3 and 3.1.
Clause (4) implements the rule ($\sqcup$) (combined with Modus Ponens and rule ($\sqsubseteq$)).
The constraint $\alpha_1\beta_1 \sqcup \alpha_2\beta_2 = \alpha\beta$ in such a clause is solved by means of the axioms
defining the least upper bound introduced in Section 3.3. Clause (5) manages
constraints by passing them directly to the constraint solver.



5 Examples 

In this section we present some examples which illustrate the expressiveness and 
the conciseness of STACLP. The first example shows how spatial data can be 
modeled by annotations and integrated with temporal information. This example 
is taken from [8], where objects with time-varying activities are modeled in the
system DEDALE, a generalization of the constraint database model of [10] which 
relies on a logical model based on linear constraints. Example 2 points out some 
further peculiarities of our approach that cannot be modeled in [8]. Example 3 
and Example 4 describe how it is possible to define moving objects in STACLP. 
Finally, the last example highlights how the features offered by constraint logic
programming improve the reasoning ability in STACLP.



5.1 Ski Tourism 

Assume that a person is described by his/her name, the job, the activity and
the spatial position(s) in a certain time interval. For instance, John is a tourist
and from 1am to 10am he sleeps, from 11am to 12am he has breakfast and then
in the afternoon he goes skiing up to 4pm, while Monica is a journalist and she
skis from noon to 4pm. This can be expressed by means of the following clauses.

person(john, tourist).
does(john, sleep) atp (2, 12) th [1am, 10am].
does(john, eat) atp (2, 6) th [11am, 12am].
does(john, ski) inr [(500, 2000), (1000, 2000)] th [12am, 4pm].

person(monica, journalist).
does(monica, ski) inr [(500, 2000), (1000, 1500)] th [1pm, 4pm].

The temporal information is represented by a th annotation because the property
holds throughout the time period. Instead the spatial location is expressed
by using an atp annotation when the exact position is known, or by an inr
annotation if we can only delimit the area where the person can be found.
Furthermore, a resort can be described by its name and its area represented
by a thr annotation.

resort(terrace) thr [(3, 5), (8, 10)].
resort(ski) thr [(50, 2000), (1000, 2000)].

Below we show how some queries from [8], involving the spatial and/or tem- 
poral knowledge, can be formulated in our language. 

1. Where is John between 12am and 2pm? 
does(john, _) inr R in [12am, 2pm] 

The answer to this query consists of (possibly different) regions where John 
stays during that time period. We use the in annotation because we want 
to know all the different positions of John between 12am and 2pm while the 
inr annotation allows one to know the region John is in during that time 
period, even if his exact position is unknown. 

If we asked for does(john, _) atp R th [12am, 2pm] then we would have con- 
strained John to stay in only one place for the whole time period. 

The query does(john, _) atp R in [12am, 2pm] asks for definite positions of
John sometime in [12am, 2pm].

2. When does Monica stay at the terrace? 
does(monica, _) inr R th I, resort(terrace) thr R

The result is the time interval I in which Monica’s position is somewhere in
the terrace area.

3. Where is John while Monica is at the terrace? 

does(john, _) inr R th I, does(monica, _) inr R1 th I, resort(terrace) thr R1
This query is a composition of a spatial join and a temporal join. 

4. In which places does Monica sleep? 
does(monica, sleep) thr R in [0am, 12pm] 

5. Where did Monica and John meet? 

does(john, _) atp P at T, does(monica, _) atp P at T 

This query exploits the fact that two people meet if they are in the same 
place at the same time. 

6. Who ate in the skiing area, and when? 
does(X, eat) inr R th I, resort(ski) thr R

Grumbach et al. [8] represent only definite spatial and temporal data, corre- 
sponding to our thr and th annotations. Indeed, they model the trajectory of a 
person as a set of regions associated with time periods where the person can be 
found, a sort of indefinite spatial information. Thus, in Grumbach et al. the differ- 
ence between definite and imprecise information is blurred, whereas in our frame- 
work it can be captured by resorting to inr annotations for indefinite spatial 
data (see, e.g., does(monica, ski) inr [(500, 2000), (1000, 1500)] th [1pm, 4pm]).

Furthermore, in [8] time and space are associated with attributes of relations.
The temporal and the spatial dimensions are independent, thus it is not possible
to express spatial relations parametric with respect to time. Hence, their concept
of moving objects captures only discrete changes. We will see in Section 5.3 that,
instead, in STACLP continuous changes can be naturally modeled.

An advantage of [8] is that it allows for a more compact representation of 
the information by means of disjunctive constraints. On the other hand, here we 
use only conjunctive constraints, and we model disjunction by defining different 
clauses, one for each disjunct. Observe also that, owing to the absence of negation,
STACLP does not allow us to express those database operations requiring nega- 
tion, like difference between relations. The treatment of negation is not straight- 
forward and represents an interesting topic of future research. 

5.2 Periodicity and Indefiniteness 

In STACLP one can also express periodic temporal information and indefinite
data in time and/or space.

Example 2. Suppose that Frank works somewhere in Florence (indefinite spatial
information) on Wednesdays (periodic temporal information). Such a situation can
be simply expressed by the clause

does(frank, work) inr R at T ← wed at T, resort(florence) thr R

The predicate wed is defined as

wed at w.        wed at T + 7 ← wed at T

where w is the date of a Wednesday. The predicate wed holds if T is a Wednesday
and resort(florence) thr R binds R to the area of the town of Florence (represented
by a set of rectangles).

Moreover, to express the fact that Frank has a break some time between 5pm
and 6pm (indefinite temporal information) at the terrace we can use the clause

does(frank, break) inr R in [5pm, 6pm] ← resort(terrace) thr R
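The recursive definition of wed amounts to a periodicity test. As a minimal sketch, assuming that time points are integer day numbers and that w is the day number of some known Wednesday (both assumptions made here for illustration), it can be written as:

```python
def wed_at(t: int, w: int) -> bool:
    """True if day t can be reached from the base Wednesday w by repeatedly adding 7 days."""
    return t >= w and (t - w) % 7 == 0
```

The test `t >= w` reflects that the two clauses only derive Wednesdays from w onwards; days before w are not derivable.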



5.3 Moving Objects 

The STACLP framework allows one to describe spatial relations which may be 
parametric with respect to time, i.e., with respect to their evolution, for instance 
time-varying areas and moving points. In the following examples we assume that
time and space are interpreted over “compatible” domains, for instance if time 
is dense then also the spatial domain has to be dense. 

Example 3. Suppose we have a rectangular area on a shore, where a tide is coming
in. The front edge of the tide water is a linear function of time. At 1:00am the
area flooded by the water will be a line, then it will start being a rectangle with
a linearly growing area. We can represent such a phenomenon by the following
clause:

floodedarea thr [(2, 8), (2, 1 + T)] at T ← 1 ≤ T ≤ 6

The spatial annotation is parametric with respect to time and this allows an
interaction between the spatial and temporal attributes. The idea is similar to the
Parametric 2-spaghetti model proposed by Chomicki and Revesz in [4]. In such
an approach, which generalizes the 2-spaghetti model, objects are triangulated
and the vertex coordinates of the triangles can be linear functions of time.
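The time-parametric annotation of Example 3 can be sketched as a function from time to rectangle. This is only an illustration: the tuple encoding, the function names `flooded_area` and `area`, and the reading of the clause body as the closed interval from 1 to 6 are assumptions made here.

```python
def flooded_area(t):
    """Rectangle flooded at time t (in hours), or None outside the modeled period."""
    if not 1 <= t <= 6:
        return None
    # The x-extent is fixed; the upper y bound grows linearly with time.
    return ((2, 8), (2, 1 + t))


def area(rect):
    """Area of a rectangle encoded as ((x1, x2), (y1, y2))."""
    (x1, x2), (y1, y2) = rect
    return (x2 - x1) * (y2 - y1)
```

At t = 1 the rectangle degenerates to a line of area 0, and the area then grows by 6 units per hour, which matches the description of a linearly growing flooded region.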



Example 4. A moving point can be modeled easily by using a clause of the form:

moving_point atp (X, Y) at T ← constraint(X, Y, T)

For instance consider a car moving on a straight road with speed v and assume
that its initial position at time $t_0$ is $(x_0, y_0)$. The position (X, Y) of the car at
T can be computed as follows:

car_position atp (X, Y) at T ← $X = x_0 + v(T - t_0),\ Y = y_0 + v(T - t_0)$

In a similar way we can represent regions which move continuously in the plane. 
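The car clause of Example 4 is a pair of linear equations in T. A direct transcription, with hypothetical parameter names and no claim about how STACLP itself evaluates the constraint, could look as follows:

```python
def car_position(t, x0, y0, t0, v):
    """Position (X, Y) at time t of a car that was at (x0, y0) at time t0.

    Follows the clause literally: both coordinates advance by v per unit of time.
    """
    return (x0 + v * (t - t0), y0 + v * (t - t0))
```

Unlike this function, the constraint in the clause is relational: a constraint solver can also use it in the other direction, for example to find the time T at which the car reaches a given X.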

5.4 Transitive Closure 

Suppose that we want to describe towns and roads of a region and inquire the 
system about the connections among them. This is a typical example of network 
analysis, that may find applications in many areas (see e.g. [21]). To do this kind 
of analysis inside our framework, we can exploit the deductive power and the 
possibility of defining recursive predicates of constraint logic programming. 

First of all we give the definition of a general predicate path, that given the
identifiers of two objects, $O_1$ and $O_2$, a list of properties, LProp, and a time period
$[T_1, T_2]$, returns a possible way to reach $O_2$ from $O_1$ crossing areas satisfying
properties in LProp during the given time period. The temporal component is
associated with the properties, since it is very common that properties vary in
time. For instance during winter some roads may not be available because of
the snow, and thus, in order to find a right path, temporal information must be
taken into account.

path($O_1$, $O_2$, LProp, Acc, [$O_2$|Acc]) th [$T_1$, $T_2$] ← obj($O_1$) thr R, obj($O_2$) thr R
path($O_1$, $O_2$, LProp, Acc, L) th [$T_1$, $T_2$] ←
    $O_1 \neq O_2$, hasProp(O, Prop) th [$T_1$, $T_2$],
    member(Prop, LProp), nonMember(O, Acc), $O \neq O_2$,
    obj(O) thr R, obj($O_1$) thr R, path(O, $O_2$, LProp, [O|Acc], L)

The predicates member and nonMember check whether an object does or does
not belong to a list, respectively. The fourth argument of path is an accumulator
in which we collect the objects we have already selected during the computation,
and the fifth argument is the list, in inverse order, of the objects crossed to reach
$O_2$ from $O_1$.

The meaning of the two clauses is straightforward: the first one states that
if the two objects intersect then our search for the path is finished. Otherwise,
we look for an object O, different from $O_2$, for which one of the properties in
LProp holds in the time period $[T_1, T_2]$, which has not been selected yet, and
which intersects the object $O_1$. Finally, we make a recursive call to find a path
between O and $O_2$.

Now we can easily find a route to go from a town to another during a certain 
time period, if this route exists, by using roads. 

route($Town_1$, $Town_2$, L) th [$T_1$, $T_2$] ← path($O_1$, $O_2$, [road], [$O_1$], L) th [$T_1$, $T_2$]

The definition of the predicate route consists in specifying road as property for
the objects we want to use to build a path from $Town_1$ to $Town_2$.

The main advantage of this approach is at the specification level: We declar- 
atively state what is a route without having a fixed network. In other words, the 
only information we employ is the area of the represented objects, we do not use 
predefined nodes and links between them to move from one position to another. 
This leaves a larger freedom to specify conditions that the route has to satisfy 
and it makes the approach general, and application independent. 
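The search performed by path can be sketched outside STACLP as a depth-first search over objects whose areas intersect. The sketch below makes several simplifying assumptions that are not in the text: every object is a single rectangle rather than a union of rectangles, each object has at most one property, and the time period is ignored. All names (`intersects`, `path`, `objs`, `props`) are hypothetical.

```python
def intersects(r, q):
    """True if two rectangles ((x1, x2), (y1, y2)) share at least one point."""
    (x1, x2), (y1, y2) = r
    (a1, a2), (b1, b2) = q
    return x1 <= a2 and a1 <= x2 and y1 <= b2 and b1 <= y2


def path(objs, props, o1, o2, lprop, acc=()):
    """Return the objects crossed from o1 to o2 in inverse order, or None if there is no path."""
    # First clause: the two objects intersect, so the search is finished.
    if intersects(objs[o1], objs[o2]):
        return [o2, *acc]
    # Second clause: pick a not yet visited object with an allowed property that touches o1.
    for o in objs:
        if (o != o2 and o not in acc and props.get(o) in lprop
                and intersects(objs[o], objs[o1])):
            found = path(objs, props, o, o2, lprop, (o, *acc))
            if found is not None:
                return found
    return None


# Made-up data: two towns joined by two consecutive road segments.
objs = {
    "a": ((0, 1), (0, 1)),
    "r1": ((1, 5), (0, 1)),
    "r2": ((5, 9), (0, 1)),
    "b": ((9, 10), (0, 1)),
}
props = {"r1": "road", "r2": "road"}
route = path(objs, props, "a", "b", ["road"], ("a",))
```

As in the clauses, no graph of nodes and links is given in advance: connectivity is derived only from the areas of the objects.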



6 Conclusions 

We have extended the Temporal Annotated Constraint Logic Programming 
framework by adding spatial annotations, resulting in STACLP. As for TACLP, 
the approach is conceptually simple and expressive. 

The clausal fragment of STACLP can be implemented by encoding the inference
rules directly in a constraint logic programming language. We are currently
developing an implementation of the constraint theory needed to perform the
lattice operations on spatio-temporal annotations, by using the constraint handling
rules (CHR) library of SICStus Prolog [18]. Such an implementation is a
straightforward extension of the already implemented TACLP framework, which
has been proved to be tractable and efficient [7].

For some application areas like spatial databases and GISs, the STACLP 
framework is still not expressive enough: It lacks some set-based operations on 
geometric objects, such as set-difference and complement which would require 
the presence of negation in the language. The treatment of negation in STACLP 
is an interesting topic for future research. Furthermore, STACLP does not provide
operations for expressing topological relations. In the spirit of Egenhofer’s
9-intersection model [5], including the definition of the boundary and of the interior
of a spatial object should allow one to find the topological relations existing
between two convex objects. More work is needed in this direction.

Finally, it would be interesting to cope with different granularities in space 
and time, a capability which is particularly relevant to support interoperability 
among systems [2,4]. For instance, since spatial and temporal information can 
be expressed in diverse measurement units (meters, feet, seconds, days), one 
could think of introducing in STACLP a notion of unit and a set of conversion
predicates.

Abstract. In constraint solvers, variable and value ordering heuristics
are used to fine-tune the performance of the underlying search and propagation
algorithms. However, few guidelines have been proposed for when
to choose what heuristic among the wealth of existing ones. Empirical 
studies have established that this would be very hard, as none of these 
heuristics outperforms all the other ones on all instances of all problems 
(for an otherwise fixed solver). The best heuristic varies not only between 
problems, but even between different instances of the same problem. Tak- 
ing heed of the popular dictum “If you can’t beat them, join them!” we 
devise a practical meta-heuristic that automatically chooses, at run-time, 
the “best” available heuristic for the instance at hand. It is applicable to 
an entire class of NP-complete subset problems. 



1 Introduction 



If you can’t beat them, join them! 

— Anonymous 

Constraint Satisfaction Problems (CSPs) — where appropriate values for the 
problem variables have to be found within their domains, subject to some con- 
straints — represent many real life problems. Examples are production planning 
subject to demand and resource availability, air traffic control subject to safety 
protocols, transportation scheduling subject to initial and final location of the 
goods and the transportation vehicles, etc. Many of these problems can be ex- 
pressed as constraint programs and then be solved using constraint solvers. 

Constraint solvers (such as SICStus clp(FD) [2] and OPL [18]) are equipped
with constraint propagation algorithms based on consistency techniques such
as bounds consistency, plus a search algorithm such as forward-checking, and
labelling heuristics, one of which is the default. To enhance the performance of
a constraint program, much research has been done in recent years to develop
new heuristics concerning the choice of the next variable to branch on during 
the search and the choice of the value to be assigned to that variable, giving rise 
to variable and value ordering (VVO) heuristics. These heuristics significantly 
reduce the search space [16]. However, little is said about the application domain 
of these heuristics, so programmers find it difficult to decide when to apply a
particular heuristic, and when not.

In order to understand our terminology, note that the phrase problem class
here refers to a whole set of related problems, while the term problem designates 
a particular problem (within a class) , and the word instance is about a particular 
occurrence of a problem. (We here identify problems with their chosen models.) 
For example, planning is a problem class, travelling salesperson is a problem 
within that class, and visiting all capital cities of Europe is an instance of that 
problem. Much of (constraint) programming research is about pushing results 
from the instance level to the problem level, if not to the problem-class level, so 
as to get generic results. 

The difficulty of mapping the right heuristic to a given problem is mainly 
due to the following. As shown by Tsang et al. [17], there is no universally best 
solver for all instances of all problems. Thus, we are only told that a particular 
solver is “best” for the particular instances used by researchers to carry out 
their experiments. Therefore, as also noticed by Minton [14], the performance of 
solvers is instance-dependent, i.e., for a given problem a solver can perform well 
for some (distributions on the) instances, but very poorly on others. 

In such a case, conventional wisdom suggests joining the competitors, al- 
though we propose a novel way of interpreting this popular dictum: rather than 
joining efforts with the competitors (by teaming up with some of them), we ad- 
vocate joining the efforts of the competitors, thus “acquiring” some of them and 
being at the helm! But, how can this be done here, as a solver cannot know in 
what situation it is? The answer is to do the investigation at the level of problem 
classes, and to enrich the solver accordingly. 

Assuming that we have a set H of VVO heuristics (including the default one), 
we take an empirical approach to completely pre-determine a meta-heuristic 
that can decide which available heuristic in H “best” suits the instance to be 
solved, and this for any instance of any problem of the considered class. (We 
here use constraint solvers as blackboxes, thus fixing the propagation and search 
algorithms.) Such a meta-heuristic can then be added to the constraint solver. 
We here illustrate our approach with an NP-complete class of subset problems. 

This paper^1 is organised as follows. In Section 2, we discuss a class of subset
problems and show the generic finite domain constraint store that results from 
such problems. Then, in Section 3, we present our empirical approach and de- 
vise our meta-heuristic for subset problems. Finally, in Section 4, we conclude, 
compare with related work, and discuss our directions for future research. 

2 Subset Decision Problems 

We assume that CSP models are initially written in a very expressive, purely 
declarative, typed, set-oriented constraint programming language, such as our 
ESRA [6], which is designed to be higher-level than even OPL [18]. We can automatically
compile ESRA programs into lower-level finite-domain constraint languages
such as clp(FD) or OPL. The purpose of this paper is not to discuss how
this can be done, nor the syntax and semantics of ESRA.

^1 This paper is an extension of the unrefereed [11].

In the class of subset (decision) problems, a subset S of a given finite set T
has to be found, such that S satisfies an (open) constraint g, and any
two distinct elements of S satisfy an (open) constraint p. In ESRA, we model this
as (a sugared version of) the following (open) program:

$\forall T, S : set(\alpha)\ .$
$subset(T, S) \leftrightarrow S \subseteq T \wedge g(S)\ \wedge$   (subset)
$\quad \forall i, j : \alpha\ .\ i \in S \wedge j \in S \wedge i \neq j \rightarrow p(i,j)$

The only open symbols are type $\alpha$ and the constraints g and p (as $\subseteq$, $\in$, and $\neq$ are
primitives of ESRA, with the usual meanings). This program has as refinements
(closed) programs for many problems, such as finding a clique of a graph (see
below), set covering, knapsack, etc. For example, the (closed) program:

$\forall V, C : set(\alpha)\ .\ \forall E : set(\alpha \times \alpha)\ .$
$clique5((V, E), C) \leftrightarrow C \subseteq V \wedge size(C, 5)\ \wedge$   (clique5)
$\quad \forall i, j : \alpha\ .\ i \in C \wedge j \in C \wedge i \neq j \rightarrow (i,j) \in E$

is a refinement of subset, under the substitution:

$\forall C : set(\alpha)\ .\ g(C) \leftrightarrow size(C, 5)$
$\forall E : set(\alpha \times \alpha)\ .\ \forall i, j : \alpha\ .\ p(i,j) \leftrightarrow (i,j) \in E$   ($\sigma$)

where size is another primitive of ESRA, with the obvious meaning. It is a program
for a particular case of the clique problem, namely finding a clique (that is, a
set of pairwise adjacent vertices) of an undirected graph (which is given through
its vertex set V and edge set E), such that the size of the clique is 5.

At a lower level of expressiveness, ESRA subset problems can be compiled into
finite-domain constraint programs. The chosen representation of a subset S of a
given finite set T (of n elements) is a mapping from T into Boolean variables (in
{0, 1}), that is we conceptually maintain n couples $(T_i, B_i)$ where the (initially
non-ground) Boolean $B_i$ expresses whether the (initially ground) element $T_i$ of
T is a member of S or not:^2

$\forall T_i : \alpha\ .\ T_i \in T \rightarrow (B_i \leftrightarrow T_i \in S)$   (1)

This Boolean representation of sets is different from the set interval representation
of Conjunto [8] and Oz [15], but both have been shown to create the same
$O(2^n)$ search space [8]. Moreover, the set interval representation does not allow
the definition of some (to us) desirable high-level primitives, such as universal
quantification over elements of non-ground sets.

Given this Boolean representation choice for sets, the open constraints g
and p of subset can easily be re-stated in terms of finite-domain constraints on
Boolean variables. As shown in [5], it is indeed easy to write constraint-posting
programs for $\in$, $\subseteq$, size, and all other classical set operations.

We here pay special attention to the NP-complete class where g only constrains
the size of the subset to be a certain constant, and where p is not true.^3
Having a good (meta-)heuristic for one NP-complete problem Q is of great practical
interest, as some other NP problems can be reduced, in polynomial time
and without loss of properties, to Q, so that the (meta-)heuristic is also usefully
applicable to them.

^2 In formulas, atom $B_i$ is an abbreviation for atom $B_i = 1$.

^3 When g is true, subset problems can be reduced, in our representation, to 2-SAT
(satisfiability of sets of 2-literal clauses), which is in P.

Restricting the size of the subset to be a given constant, say k, can be written
as the following n-ary constraint:

$\sum_{i=1}^{n} B_i = k$   (2)

Let us now look at the remaining part of subset, which expresses that any two
distinct elements of the subset S of T must satisfy a constraint p:

$S \subseteq T \wedge \forall T_i, T_j : \alpha\ .\ T_i \in S \wedge T_j \in S \wedge T_i \neq T_j \rightarrow p(T_i, T_j)$

This implies:

$\forall T_i, T_j : \alpha\ .\ T_i \in T \wedge T_j \in T \wedge T_i \in S \wedge T_j \in S \wedge T_i \neq T_j \rightarrow p(T_i, T_j)$

which is equivalent to:

$\forall T_i, T_j : \alpha\ .\ T_i \in T \wedge T_j \in T \wedge T_i \neq T_j \wedge \neg p(T_i, T_j) \rightarrow \neg(T_i \in S \wedge T_j \in S)$

By (1), this can be rewritten as:

$\forall T_i, T_j : \alpha\ .\ T_i \in T \wedge T_j \in T \wedge T_i \neq T_j \wedge \neg p(T_i, T_j) \rightarrow \neg(B_i \wedge B_j)$

Thus, for every two distinct elements $T_i$ and $T_j$ of T, with corresponding Boolean
variables $B_i$ and $B_j$, if $p(T_i, T_j)$ does not hold, we just need to post the following
binary constraint:

$\neg(B_i \wedge B_j)$   (3)

It is crucial to note that the actually posted finite-domain constraints are thus
not in terms of p, hence p can be any ESRA formula and our approach works for
our whole class of subset problems! Indeed, the reasoning above was made for
the (open) subset program rather than for a particular (closed) refinement such
as clique5.

Therefore, the finite-domain constraint store for any subset problem of the
considered class is over a set of (only) Boolean variables and contains an instance-dependent
number of binary constraints of the form (3) (if p is not true) as well
as a summation constraint of the form (2).^4 Our results are thus “best” applied
only in the NP class where p is not true and g is only a size constraint. Extending
our results to other contents of g is only a matter of time, but the considered
class is NP-complete and thereby our results are already significant.
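The shape of this constraint store can be illustrated with a small Python sketch. It is not the ESRA compiler and not a finite-domain solver: the functions `constraint_store` and `solve` are hypothetical, and the search is a naive enumeration of all assignments with exactly k Booleans set, which only serves to show constraints (2) and (3) at work on a made-up graph.

```python
from itertools import combinations


def constraint_store(elements, p, k):
    """Return (n, k, forbidden), where forbidden lists the index pairs (i, j) carrying constraint (3)."""
    n = len(elements)
    forbidden = [
        (i, j)
        for i, j in combinations(range(n), 2)
        # p must hold for the pair in both orders; otherwise not(B_i and B_j) is posted.
        if not (p(elements[i], elements[j]) and p(elements[j], elements[i]))
    ]
    return n, k, forbidden


def solve(n, k, forbidden):
    """Naive search: try every 0/1 assignment with sum k (constraint (2)) and check constraints (3)."""
    for chosen in combinations(range(n), k):
        b = [0] * n
        for i in chosen:
            b[i] = 1
        if all(not (b[i] and b[j]) for i, j in forbidden):
            return b
    return None


# Made-up instance: find a clique of size 3 in a small undirected graph.
vertices = [1, 2, 3, 4]
edges = {(1, 2), (2, 3), (1, 3), (3, 4)}


def adjacent(u, v):
    return (u, v) in edges or (v, u) in edges


n, k, forbidden = constraint_store(vertices, adjacent, 3)
solution = solve(n, k, forbidden)
```

Here the instance is characterised by n = 4, k = 3 and b = `len(forbidden)` = 2, and the predicate `adjacent` is consulted only while the store is built, never during the search.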

^4 Having obtained what is largely a binary CSP (over Boolean variables) is a mere
coincidence, and irrelevant to our approach.

3 A Meta-heuristic for Subset Decision Problems

We now present our approach for devising a meta-heuristic for the entire presented
class of subset problems, describe the experimental setting used, and show
how to use the obtained results to devise a meta-heuristic for that class.



3.1 Approach Taken 

On the one hand, we are able to map all subset problems of the considered class 
into a generic finite-domain constraint store, parameterised by the number n of 
Boolean variables involved (i.e., the size of the given set), the subset size k, and
the number b of binary constraints of the form (3).® On the other hand, an ever 
increasing set $\mathcal{H}$ of VVO heuristics for CSPs is being proposed.

Our approach is to first measure the median cost (in CPU time and in num- 
ber of backtracks) of each heuristic, for a fixed finite-domain solver, on a large 
number of instances with different values for (n, k, b). Then we try and determine
the range of instances (in terms of (n, k, b)) for every heuristic in which it
performs “best,” so as to devise a meta-heuristic that always picks the “best”
heuristic in $\mathcal{H}$ for any instance. Note that these measures, and hence the
meta-heuristic, are thus entirely made off-line and once-and-for-all, for all instances
of all problems of the whole (and large) subset problem class. Compared to a 
hardwired choice of a heuristic, the run-time overhead of the meta-heuristic for a 
particular instance will only consist of counting the number b of actually posted 
binary constraints of the form (3) and then looking up which heuristic to use. 
For all but the most trivial instances, this overhead is negligible, because the 
calculations are easy. 

Our approach rests on the assumption that all instances of the same (n, k, b) 
family® will benefit from the heuristic chosen for the instance that had the me- 
dian cost. More analytical and empirical work is of course needed to better un- 
derstand and model the variance in behaviour inside a family, and to understand 
whether (n, k, b) is an effective characterisation of subset problem instances. 

To illustrate the idea, let us assume that we have just two heuristics, $H_1$ and
$H_2$ say. If we keep n and b constant, we can measure the costs of both heuristics
for all values of k. The illustrative plot (from made-up data) in Figure 1 suggests
the following meta-heuristic:

if $k \in 1..3$ then choose $H_1$
if $k \in 3..5$ then choose $H_2$
if $k \in 5..n$ then choose $H_1$

However, in our case, the problem is more difficult because we have three varying 
dimensions rather than just one, namely n, k, and b. 

® It is the instance data that determine, after posting all constraints, the value of b. 

® A family (of instances) is not to be confused with a class (of problems). 
Fig. 1. Typical median-cost curve in terms of k for two heuristics



3.2 Experimental Setting 

Heuristics Chosen. For the purpose of this paper, we focused on three VVO 
heuristics only, as we would just like to show here that the principle works. 
More VVO heuristics can easily be added to the experiments, if given more 
time. We also generated random instances in a coarse way (by not considering 
all possible combinations of (n, k, b) up to a given n); again, given more time,
instances generated in a more fine-grained way could be used instead and help 
make our results more precise. Finally, we calculated the median cost of only 5 
instances for each (n, k, b) family, in order to offset the impact of exceptionally 
hard instances and failed searches; given more time, many more instances should 
be generated for each (n, k, b). We used the following three VVO heuristics:

— The default VVO heuristic, here named $H_1$, labels the leftmost variable in
the sequence of variables provided, and the domain of the chosen variable is 
explored in ascending order. 

— The static VVO heuristic, here named $H_2$, pre-orders the variables in
ascending order, according to the number of constraints in which a variable is
involved, and then labels the variables according to that order by assigning 
the value 1 (for true) first (as we need only consider the Boolean domain). 

— The dynamic VVO heuristic, here named $H_3$, says that the next variable is
chosen in a way that maximises the sum of the promises [7] of its values, and 
that it is labelled with the minimum promising value. 
The default VVO heuristic does not introduce any overhead. The static one has
a pre-processing overhead, while the dynamic one is the most costly one, as it 
incorporates calculations at each labelling step. We tested the effect of these 
heuristics by fixing the propagation and search algorithms, namely by using the 
ones of SICStus clp(fd) [2].

Instance Characterisation. The finite-domain constraint store for any subset 
problem is over a set of Boolean variables and contains an instance-dependent 
number of binary constraints as well as a summation constraint. For binary 
CSPs, a family of instances is usually characterised by a tuple $(n, m, p_1, p_2)$ [17],
where n is the number of variables, m is the (assumed constant) domain size
for all variables, $p_1$ is the (assumed constant) constraint density, and $p_2$ is the
(assumed constant) tightness of the individual constraints. 

In our experiments, the variable count n is the number of Boolean variables;
we varied it over the interval 10..200, by increments of 10. The domain size m is
fixed to 2 as we only consider the Boolean domain {0, 1} in subset problems, so
that m can be discarded. The constraint density $p_1$ is $b / \binom{n}{2}$, where b is the
number of actually posted binary constraints; rather than varying b (as initially
advocated), we varied the values of $p_1$, using the interval 0.1..1, by increments
of 0.1, as this also leads to an interval of b values. Since the considered binary
constraints are of the form $\neg(B_i \wedge B_j)$, their tightness is always equal to 3/4
and thus need not be varied. The tightness of the summation constraint, however,
varies, as it is $\binom{n}{k} / 2^n$, where k is the desired size of the subset; instead of varying
the values of $p_2$, we varied (as initially advocated) the values of k, over the
interval 1..n, by increments of 1, as this also leads to an interval of $p_2$ values.
In any case, varying $p_2$ by a constant increment over the interval 0..1 would
have missed out on a lot of values for k; indeed, when k ranges over the integer
interval 1..n, the corresponding values of $p_2$ do not exhibit a constant increment
within 0..1. Hence we used $(n, p_1, k)$ to characterise instance families.
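The two quantities used in this characterisation can be computed directly. The Python sketch below is only an illustration under stated assumptions: the function names `density` and `summation_tightness` are hypothetical, and tightness is taken to be the fraction of satisfying assignments, in line with the expressions above.

```python
from math import comb

def density(n, b):
    """Constraint density p1: posted binary constraints over all variable pairs."""
    return b / comb(n, 2)

def summation_tightness(n, k):
    """Fraction of the 2**n Boolean assignments with exactly k variables set to 1."""
    return comb(n, k) / 2 ** n

# For a fixed n, list the tightness for every subset size k in 1..n and the
# differences between consecutive values.
n = 10
p2_values = [summation_tightness(n, k) for k in range(1, n + 1)]
steps = [later - earlier for earlier, later in zip(p2_values, p2_values[1:])]
```

The entries of `steps` differ from one another, which illustrates why sampling the tightness at a constant increment would skip many values of k.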



Experiments. Having thus chosen the intervals and increments for the param- 
eters describing the characteristics of families of instances of subset problems, 
we randomly generated many different instances and then used the three chosen 
heuristics in order to find the first solution or prove that there is no solution. 
Some of the instances were obviously too difficult to solve or disprove within a 
reasonable amount of time. Consequently, to save time in our experiments, we 
used a time-out on the CPU time. Hence our meta-heuristic can currently not 
select the “best” available heuristic for a given instance family when all three 
heuristics timed out on the 5 instances we generated. 

The obtained results are reported, in Table 1, as $(n, p_1, k, c_1, c_2, c_3)$ tuples,
where $c_i$ is (here) the median CPU-time for heuristic i. The scale of the timings,
as well as the used hardware and software platforms are irrelevant, as we are only 
interested in the relative behaviour of the heuristics. We can see that indeed no 
heuristic outperforms all other heuristics, or is outperformed by all the others, 
or never outperforms all the others. Moreover, the collected costs look very 
n      $p_1$    k     $c_1$       $c_2$     $c_3$
100    0.2      6     40          970       2030
110    0.2      22    time-out    20        1880
130    0.3      18    time-out    10250     5150

Table 1. Tabulated results of the experiments



unpredictable and have many outliers. This confirms Minton’s and Tsang et al.’s
results, and also shows that human intuition may break down here (especially
when dealing with black-box solvers).

In order to analyse the effects of each heuristic on different instances, we 
drew various charts from Table 1, for example by keeping n and $p_1$ constant
and plotting the costs for each k. Figure 2 shows an example of the CPU-time
behaviours of the three heuristics on the instances where n = 110 and $p_1$ = 0.4.

From the results of the empirical study, we can already conclude the following, 
regarding subset problems: 

— As k gets smaller, for a given $p_1$ and n, the default VVO heuristic almost
always outperforms the others.

— As k gets larger, for a given $p_1$ and n, the performance of the default VVO
heuristic degenerates, but the static and dynamic VVO heuristics behave 
much more gracefully (see Figure 2). 

— Even though it is very costly to apply the dynamic VVO heuristic, it some- 
times outperforms the other two heuristics. 

— For some of the instances, all the heuristics failed to find a solution, or prove 
the non-existence of solutions, within a reasonable amount of time. 



3.3 The Meta-heuristic 

Designing the Meta-heuristic. Using the obtained table as a lookup table, it 
is straightforward to devise a (non-adaptive) meta-heuristic that first measures 
the parameters $(n, p_1, k)$ of the given instance, and then uses the (nearest)
corresponding entry in the table to determine which heuristic to actually run on
this instance. Considering the simplicity of these measures, the (constant)
run-time overhead is negligible, especially since it nearly always pays off anyway. The
meta-heuristic thus gives rise to an instance-independent program that is guar- 
anteed to run, for any instance, (almost exactly) as fast as the fastest considered 
heuristic for its instance family. 
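A minimal Python sketch of such a table-driven meta-heuristic is given next. It is an illustration under stated assumptions, not the authors' implementation: the names `build_table` and `choose_heuristic`, the treatment of a time-out as infinite cost, and the particular distance used to find the "nearest" family are all hypothetical choices. The sample costs are the three rows of Table 1, each entered as a single-element list so that its median is the reported value.

```python
from statistics import median

TIME_OUT = float("inf")  # assumption: a timed-out run counts as infinite cost

def build_table(measurements):
    """Map each (n, p1, k) family to the heuristic with the lowest median cost.

    measurements: {(n, p1, k): {heuristic_name: [cost per instance, ...]}}
    A family on which every heuristic timed out maps to None.
    """
    table = {}
    for family, costs in measurements.items():
        medians = {h: median(c) for h, c in costs.items()}
        best = min(medians, key=medians.get)
        table[family] = best if medians[best] < TIME_OUT else None
    return table

def choose_heuristic(table, n, p1, k):
    """Return the entry of the nearest measured family.

    The normalised distance below is an assumption of this sketch; the text
    does not define what "nearest" means.
    """
    def distance(family):
        fn, fp1, fk = family
        scale = max(fn, n)
        return abs(fn - n) / scale + abs(fp1 - p1) + abs(fk - k) / scale
    return table[min(table, key=distance)]

measurements = {
    (100, 0.2, 6): {"H1": [40], "H2": [970], "H3": [2030]},
    (110, 0.2, 22): {"H1": [TIME_OUT], "H2": [20], "H3": [1880]},
    (130, 0.3, 18): {"H1": [TIME_OUT], "H2": [10250], "H3": [5150]},
}
table = build_table(measurements)
```

Building the table is the off-line step; at run time only `choose_heuristic` is called, so the per-instance overhead is a single scan of the table.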
Fig. 2. Median cost in terms of k for the three heuristics on n = 110 and $p_1$ = 0.4



Improving the Meta-heuristic. A very important observation is that, for
many $(n, p_1, k)$ families, all instances can be shown to have no solution, so that
the best heuristic is to fail immediately, and there is no need to even choose
between the actual heuristics in the lookup table. This is here the case when the
following holds:

$$\binom{n}{2} - b < \binom{k}{2}$$

where n is the number of Boolean variables, b is the number of binary constraints
of the form (3), and k is the size of the desired subset. Note that b also is
the number of 2-combinations of Boolean variables that cannot simultaneously
be 1 (which here stands for true), so $\binom{n}{2} - b$ is the number of 2-combinations
of Boolean variables that can simultaneously be 1. As k also is the number of
Boolean variables that must simultaneously be 1, we have that $\binom{k}{2}$ 2-combinations
of Boolean variables must be simultaneously 1. Therefore, if $\binom{n}{2} - b$ is strictly
less than $\binom{k}{2}$, then no solution exists.
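The counting argument above translates into a one-line check. The following Python sketch uses a hypothetical function name, `trivially_infeasible`, and is meant only to make the test concrete.

```python
from math import comb

def trivially_infeasible(n, b, k):
    """True when fewer compatible pairs remain than a size-k subset needs.

    n: number of Boolean variables, b: number of posted binary constraints,
    k: required subset size.
    """
    return comb(n, 2) - b < comb(k, 2)
```

For instance, with n = 10 and k = 4, a subset needs 6 compatible pairs, so the check succeeds once b reaches 40 (leaving 5 of the 45 pairs). A negative answer does not imply that a solution exists; it only means that this particular shortcut does not apply.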

This can be exploited by overwriting some entries in the lookup table, or, bet- 
ter, by reducing the number of experiments and then adding the corresponding 
fail entries to the lookup table. This leads to our meta-heuristic being sometimes 
strictly (and possibly significantly) faster than all the underlying heuristics, if 
not faster than any other heuristic! This only became possible because we (need 
to) detect the family to which the current instance belongs. 
Fig. 3. Proposed extension to the classical quest for solvers



We have furthermore made a regression analysis to derive an evaluation func- 
tion, instead of using the full look-up table. This does not speed up the resulting 
programs, but the size of the solver shrinks dramatically, as the look-up table 
does not have to be incorporated. 



Methodological Contribution. The classical approach (shown in full lines 
in Figure 3) to detecting good solvers starts by composing several solvers from 
available propagation algorithms, search algorithms, and heuristics. Random in- 
stances of binary CSPs are then generated for a given problem and/or given 
bounds and increments for the $(n, m, p_1, p_2)$ parameters that govern binary
CSPs. Running the composed solvers for the generated instances yields solving- 
cost statistics, which are always evaluated manually. 

Our approach (shown in dashed lines in Figure 3) extends this scenario 
by making the obtained statistics an input to a new process, namely meta- 
composition of solvers, so as to build a problem-and-instance-independent but 
problem-class-dependent solver that is guaranteed to outperform all the other 
ones. Also, instances are here just generated for the considered
problem-class-specific CSP (which is not necessarily a binary CSP).

4 Conclusion 

4.1 Summary 

We have shown how to map an entire class of NP-complete CSPs to a generic 
constraint store, and we have devised a class-specific but problem-independent 
meta-heuristic that chooses a suitable instance-specific heuristic. This work is 
thus a continuation of Tsang et al.’s research [17] on mapping heuristics to
application domains, and an incorporation of Minton’s and Tsang et al.’s findings 
about the sensitivity of heuristics to (distributions of) instances. The key insight 
is that we can analyse and exploit the form (and number) of the actually posted 
constraints for a problem class, rather than considering the constraint store a 
black box and looking for optimisation opportunities elsewhere. 

The importance and contribution of this work is to have shown that some 
form of heuristic, even if “only” a brute-force-designed and simple meta-heuristic, 
can be devised for an entire problem class, without regard to its problems or their 
instances. Our restriction to the (NP-complete) class of subset problems where g 
only constrains the size of the subset and p is not true was just made to simplify 
our presentation, as we only aimed at proving the existence of meta-heuristics 
for (useful) problem classes. 

Considering the availability of such a meta-heuristic, programmers can be 
encouraged to model their CSPs as subset problems rather than in a different 
way (if this possibility arises at all). Indeed, they then do not have to worry 
about which heuristic to choose, nor do they have to implement it, nor do they 
have to document the resulting program with a disclaimer stating for which 
(distribution of) instances it will run “best.” All these non-declarative issues 
can thus be taken care of by the solver, leaving only the declarative issue of 
modelling the CSP to the programmers, thus extending the range and size of 
CSPs that they can handle efficiently. Further advances along these lines will 
bring us another step closer to the holy grail of programming (for CSPs). 



4.2 Related Work 

This work follows the call of Tsang et al. for mapping combinations of solver 
components to application domains [17]. However, we here focused on just one 
application domain (or: class of problems), as well as on just the effect of VVO
heuristics while keeping the solver otherwise constant. 

Also closely related to our work is Minton’s MULTI-TAC system [14], which
automatically synthesises an instance-distribution-specific solver, given a
high-level model of some CSP and a set of training instances (or a generator thereof).
His motivation also was that heuristics depend on the distribution of instances. 
However, we differ from his approach in various ways: 
— While good performance of a solver synthesised by MULTI-TAC is only
guaranteed for the actual distribution of the given training instances, we advocate
the off-line brute-force approach of generating all possible instance families 
for a given problem class and analysing their run-time behaviours towards 
the identification of a suitable meta-heuristic that is guaranteed to choose 
the “best” available heuristic for any considered instance. 

— While MULTI-TAC uses a synthesis-time brute-force approach to generate
candidate problem-specific heuristics, we only choose our heuristics from 
(variations of) already published ones. 

— While it is the responsibility of a MULTI-TAC user to also provide training
instances (or an instance generator plus the desired distribution parameters) 
in order to synthesise an instance-distribution-specific program, our kind of 
meta-heuristic can be pre-computed once and for all, in a problem-and- 
instance-independent way for an entire class of problems, and the user thus 
need not provide more than a high-level problem model. 

— While MULTI-TAC features very long synthesis/compilation times for each
problem, our approach is to eliminate them by pre-computing the results for 
entire problem classes. 

A similar comparison can be made with Ellman’s DA-MSA system [3]. 

Our work differs from the problem complexity (as opposed to algorithm com- 
plexity) work of Williams and Hogg [19] as follows. Whereas they propose an 
analytical approach of charting the search space, under any search algorithm, 
towards predicting the location of hard instances and the fluctuations in solving 
cost, we propose an analytical approach of first charting the constraint store 
and then actually determining, with an empirical approach, the same things, 
also under any search algorithm. Furthermore, their kind of analysis has to be 
repeated for every problem, while our approach can be deployed onto an entire 
problem class. 

4.3 Future Work 

As our approach rests on the assumption that all instances of an $(n, p_1, k)$ (or,
equivalently, (n, k, b)) family will benefit from the same heuristic, namely the one
chosen for the instance that had the median cost, more analytical and empirical 
work is needed to better understand and model the variance in behaviour inside 
a family, and to understand whether (n, k, b) is an effective characterisation of 
subset problem instances. 

We are currently investigating the design of adaptive meta-heuristics [1,9] that
choose a (possibly different) heuristic after each labelling iteration, based on the 
current sub-problem, rather than sticking to the same initially chosen heuristic 
all the way. The hope is that the performance would increase even more. In [12], 
we explain why the heuristic $H_2$ of this paper often outperforms all the other
considered ones (and many others) when there is a solution. This allowed us to 
design, in [13], a first adaptive meta-heuristic for subset problems, and we now 
try to integrate it with the (non-adaptive) meta-heuristic ideas of this paper. 
We should also produce solving-cost statistics in a more fine-grained way (with
more than 5 instances of more $(n, p_1, k)$ families until some realistic upper bound
for n) and involve more known heuristics, so as to fine-tune the lookup table for
our meta-heuristic. Our principle of joining heuristics into a meta-heuristic, for 
a given problem class, can be generalised to all solver components, leading to a 
joining of entire solvers into a meta-solver, for a given problem class (as already 
shown in Figure 3). All this is just a matter of having the (CPU) time to do so. 

Other meta-heuristics for different classes of subset problems will be devised, 
for cases where g has other constraints than the size of the subset. The studied 
class of subset problems can be generalised into the class of s-subset problems 
(where a maximum of s subsets of a given set have to be found, subject to some 
constraints) [10]. Another extension is the coverage of (s-)subset optimisation 
problems, instead of just the decision problems studied here. 

Finally, we are planning to investigate other classes of problems, namely map- 
ping problems (where a mapping between two given sets has to be found, subject 
to some constraints) [4], permutation problems (where a sequence representing 
a permutation of a given set has to be found, subject to some constraints) [4], 
and sequencing problems (where sequences of bounded size over the elements of 
a given set have to be found, subject to some constraints) [6], or any combina- 
tions thereof [6], in order to derive further meta-heuristics. These will be built 
into the compiler of our ESRA constraint modelling language [6], which is more 
expressive than even OPL [18]. This will help us fulfill our design objective of 
also making ESRA more declarative than OPL, namely by allowing the omission 
of a VVO heuristic, without compromising on efficiency compared to reasonably 
competent programmers. 
Abstract. In this paper we study the use of parallelism to speed up
execution of Answer Set Programs (ASP). ASP is an emerging programming
paradigm which combines features from constraint programming, logic 
programming, and non-monotonic reasoning, and has found relevant ap- 
plications in areas such as planning and intelligent agents. We propose 
different methodologies to parallelize execution of ASP programs, and 
we describe a prototype which exploits one form of parallelism. 



1 Introduction 

In recent years we have witnessed an increasing interest in the development 
of programming paradigms centered around the notion of knowledge. These 
paradigms strive to reduce a substantial part of the programming process to 
the description of objects comprising the domain of interest and relations be- 
tween these objects. Knowledge can also be updated, modified, and used to 
make decisions on actions which need to be taken to achieve certain goals [2]. 
An important role is played by non-monotonic logics [2], which allow new axioms
to retract existing theorems, and are more adequate for common-sense
reasoning and modeling dynamic knowledge bases. One of the outcomes of research
in non-monotonic logics is represented by the development of a number of lan- 
guages for knowledge manipulation. In particular, a novel programming paradigm 
has arisen, called Answer Sets Programming (ASP) [11,12], which builds on the 
mathematical foundations of logic programming and non-monotonic reasoning. 
ASP offers highly declarative solutions in a number of well-defined application 
areas, including intelligent agents and planning. In spite of the considerable 
interest that ASP has recently attracted, it still lacks some of the fundamen- 
tal components needed to assert it as a practical programming paradigm. ASP 
currently benefits from solid mathematical foundations, but its programming 
aspects still require considerable research. There is the need to (i) develop ef- 
ficient inference engines for ASP, and (ii) develop methodologies for software 
development in ASP. Indeed, many of the research teams are currently investing 
considerable effort in these directions [5,7,13,15]. In this paper we present some 
preliminary ideas to improve performance of ASP engines through the use of 
parallelism. 
2 Answer Set Programming

Answer Set Semantics (AS) [8] was designed in the mid 80s as a tool to pro- 
vide semantics for logic programming with negation. Traditional logic program- 
ming provides the ability to derive only positive consequences from a program. 
However, in a large number of cases it is useful to reason also about negative 
consequences, by allowing negative knowledge to be inferred and allowing nega- 
tive assumptions in the rules. The introduction of negation in logic programming 
leads to the loss of a key property of logic programming: the existence of a unique 
intended model for each program. In standard logic programming there is no am- 
biguity in what is true and false w.r.t. a given program. This property does not 
hold true anymore when negation is allowed in the programs — i.e., programs 
may admit distinct independent models. Various proposals have been developed 
to provide semantics to logic programs with negation (e.g., [2,17]). In particular, 
one class of proposals allows the existence of a collection of intended models
(answer sets) for a program [8]. Answer Sets Semantics (AS) (also known as Stable
Models Semantics) is the most representative approach in this class. AS relies on
a simple definition: Given a program P and given a “tentative” model M, we can
define a new program $P^M$ by removing all rules containing negated atoms which
are contradicted by the model M, and removing all the negated atoms from the
remaining rules. Thus, $P^M$ contains only those rules of P that are applicable
given the model M. $P^M$ is a standard logic program without negation, which
admits a unique intended model M'. M is an answer set (or stable model) if
M and M' coincide. Intuitively, an answer set contains only those atoms which 
have a justification in terms of the applicable rules in the program. 
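The definition above can be made concrete with a short Python sketch for ground programs. The rule encoding as `(head, positive_body, negative_body)` triples and the function names `reduct`, `least_model`, and `is_answer_set` are assumptions of this illustration; it is a naive check, not the procedure used by any particular ASP engine.

```python
def reduct(program, model):
    """Reduct of a ground program with respect to a candidate model.

    program: iterable of rules (head, positive_body, negative_body).
    Rules with a negated atom that belongs to the model are dropped; the
    negated atoms are removed from the remaining rules.
    """
    model = set(model)
    return [(head, pos) for head, pos, neg in program
            if not (set(neg) & model)]

def least_model(positive_program):
    """Least model of a negation-free program, computed as a fixpoint."""
    derived = set()
    changed = True
    while changed:
        changed = False
        for head, pos in positive_program:
            if head not in derived and set(pos) <= derived:
                derived.add(head)
                changed = True
    return derived

def is_answer_set(program, model):
    """M is an answer set when it coincides with the least model of the reduct."""
    return least_model(reduct(program, model)) == set(model)

# Two rules:  p :- not q.   q :- not p.
program = [("p", (), ("q",)), ("q", (), ("p",))]
```

With this two-rule program, both `{"p"}` and `{"q"}` pass the check while `{"p", "q"}` and the empty set do not, which shows how negation can lead to several intended models.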

ASP - A Novel Paradigm: the adoption of AS requires a paradigm shift to 
reconcile the peculiar features of AS with the traditional operational view of 
logic programming [11,12]. In particular, under AS, each program may admit 
more than one intended model. This ends up creating an additional level of non- 
determinism — specifically a form of don’t know non-determinism — on top of the 
non-determinism typically identified in traditional logic programming. The pres- 
ence of multiple answer sets complicates the framework in two ways. We need 
to provide programmers with a way of handling multiple answer sets. One could 
attempt to restore a more “traditional” view, where a single “model” exists. This
has been attempted, for example, using skeptical semantics [11], where a formula 
is entailed from the program only if it is entailed in each answer set. Neverthe- 
less, skeptical semantics is often inadequate — e.g., in many situations it does not 
provide the desired result, and, in its general form, may provide excessive expres- 
sive power [11]. Maintaining multiple answer sets bears also close resemblance to 
similar proposals put forward in other communities — such as the choice and wit- 
ness constructs used in the database community. The presence of multiple answer 
sets leads to a new set of requirements on the computational mechanisms used. 
The goal of the computation is not to provide a goal-directed tuple-at-a-time 
answer (i.e., a true/false answer or a substitution), as in traditional logic pro- 
gramming, but the objective is to return whole answer sets — i.e., set-at-a-time 
answers. The traditional resolution-based control used in logic programming is
inadequate, and should give way to different execution mechanisms.

To accommodate for all these aspects, we embrace a different view of logic 
programming with AS, interpreted as a novel programming paradigm — that we 
will refer to as Answer Sets Programming (ASP). This term was originally cre- 
ated by Lifschitz, and nicely blends the notion of programming with the idea that 
the entities produced by the computation are answer sets. The notion of ASP 
has been advocated by others during the last couple of years [12,11]. The goal 
of an ASP program is to identify a collection of answer sets — i.e., each program 
is interpreted as a specification of a collection of sets of atoms. Each rule in the 
program plays the role of a constraint [12] on the collection of sets specified by 
the program: a generic rule $\mathit{Head} \mathrel{:-} B_1, \ldots, B_n, \mathit{not}\ G_1, \ldots, \mathit{not}\ G_m$ indicates
that, whenever $B_1, \ldots, B_n$ are part of an answer set and $G_1, \ldots, G_m$ are not,
then $\mathit{Head}$ has to be in that answer set as well. The shift of perspective from
traditional logic programming to ASP is very important. The programmer is
led to think about writing programs as manipulating sets of elements, and the
outcome of the computation is a collection of sets. This perspective comes very
naturally in a large number of application domains (graph problems, planning,
scheduling) [10,12,11]. ASP has received great attention in the knowledge
representation community, as it makes it possible to represent defaults, constraints,
uncertainty, and nondeterminism in a direct way [2].

A Sequential Execution Model for ASP: Various execution models have 
been proposed in the literature to support computation of answer sets and 
some of them have been applied as inference engines to support ASP systems 
[3,4,5,12,7,15]. In this project we propose to adopt an execution model which is 
built on the ideas presented in [12,13] and effectively implemented in the pop- 
ular Smodels system [13]. The choice is dictated by the relative simplicity of 
this execution model and its apparent suitability to exploitation of parallelism. 
The system consists of two parts: a compiler (we are currently using the lparse
compiler [13]), which is in charge of creating atom tables and performing
program grounding, and an engine, which is in charge of computing the answer sets
of the program. Our interest is focused on the engine component. A detailed 
presentation of the structure of the Smodels execution model [13] is outside the 
scope of this paper. Figure 1 presents an intuitive overview of the execution cycle 
for the computation of stable models.¹

As shown in Figure 1, the computation of answer sets can be described as a
non-deterministic process, which is needed since, in general, each program Π may
admit multiple distinct answer sets. The computation is an alternation of two
operations, expand and choose_literal. The expand operation is in charge of 
computing the truth value of all those atoms that have a determined value in 
the current answer set (i.e., there is no ambiguity regarding whether they are 
true or false). The choose_literal is in charge of arbitrarily choosing one of 
the atoms not present in the current answer set (i.e., atoms that do not have a 
determined value) and “guessing” a truth value for it. 

¹ The presentation has been simplified and does not have any pretense of completeness.




Parallel Engine for Answer Set Programming 291 



function compute(Π: Program, A: Literals)
begin
    B := expand(Π, A);
    while ( (B consistent) and (B not complete) )
        l := choose_literal(Π, B);
        B := expand(Π, A ∪ { l });
    endwhile
    if (B is stable model of Π) then
        return B;
end

function expand(Π: Program, A: Literals)
begin
    B := A; B' := ∅;
    while (B ≠ B') do
        B' := B;
        B := apply_rules(Π, B);
    endwhile;
    return B;
end







Fig. 1. Basic Execution Cycle and Expand Procedure 



Non-determinism originates from the execution of choose_literal(Π, B),
which selects an atom l satisfying the following properties: the atom l appears
negated in the program Π, and neither l nor its negation is currently present
in B. The chosen atom is added (with its “guessed” truth value) to the partial
answer set and the expansion is restarted. Each non-deterministic computation
can terminate in three different ways: (1) successfully, when the current set B
assigns a truth value to all the atoms and B is indeed an answer set of the
original program Π; (2) unsuccessfully, when a conflict is detected, i.e., there
exists an atom a which is assigned both values true and false in the current
set B; (3) unsuccessfully, when the current set B has been completely expanded
(i.e., we have assigned a truth value to every atom without any conflict), but
it does not represent an answer set of the program Π. This situation typically
occurs when a positive literal† a is introduced in B (e.g., it is guessed by
choose_literal) but the rules of the program do not provide a “support” for the
truth of a in this answer set. As in traditional execution of logic programming,
non-determinism is handled via backtracking to the choice points generated by
choose_literal. Observe that each choice point produced by choose_literal has
only two alternatives: one assigns the value true to the chosen literal, and one
assigns the value false. The expand procedure mentioned in the algorithm is
intuitively described in Figure 1 (function expand). This procedure repeatedly
applies expansion rules to the current set of literals until no more changes are
possible. The expansion rules are derived from the program Π and make it
possible to determine the truth value of those literals which have a definite
value according to the partial model B [13,4].
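
The execution cycle of Figure 1 can be illustrated with a small Python sketch.
This is not the authors' engine: the Rule type, the two expansion rules used in
expand, the Gelfond-Lifschitz stability test in is_stable, and the policy of
choosing any unassigned atom (rather than only atoms that appear negated) are
simplifying assumptions made for illustration. Backtracking is modelled by
recursion, and each guess extends the current set B.

```python
from typing import Dict, FrozenSet, Iterable, Iterator, List, NamedTuple, Optional, Set


class Rule(NamedTuple):
    head: str
    pos: FrozenSet[str]   # atoms that appear positively in the body
    neg: FrozenSet[str]   # atoms that appear under "not" in the body


def rule(head: str, pos: Iterable[str] = (), neg: Iterable[str] = ()) -> Rule:
    return Rule(head, frozenset(pos), frozenset(neg))


Program = List[Rule]
Assignment = Dict[str, bool]   # partial answer set B: atom -> truth value


def atoms_of(program: Program) -> Set[str]:
    atoms: Set[str] = set()
    for r in program:
        atoms |= {r.head} | r.pos | r.neg
    return atoms


def fires(r: Rule, b: Assignment) -> bool:
    return all(b.get(a) is True for a in r.pos) and all(b.get(a) is False for a in r.neg)


def blocked(r: Rule, b: Assignment) -> bool:
    return any(b.get(a) is False for a in r.pos) or any(b.get(a) is True for a in r.neg)


def expand(program: Program, a: Assignment) -> Optional[Assignment]:
    """Apply two simple expansion rules until B stops changing; None means conflict."""
    b = dict(a)
    changed = True
    while changed:
        changed = False
        for atom in sorted(atoms_of(program)):
            defining = [r for r in program if r.head == atom]
            if any(fires(r, b) for r in defining):
                value = True            # some rule body holds, so the head is true
            elif all(blocked(r, b) for r in defining):
                value = False           # no rule can support the atom any more
            else:
                continue
            if atom not in b:
                b[atom] = value
                changed = True
            elif b[atom] != value:
                return None             # atom would be both true and false
    return b


def is_stable(program: Program, true_atoms: Set[str]) -> bool:
    """The least model of the reduct must coincide with the candidate set."""
    reduct = [r for r in program if not (r.neg & true_atoms)]
    model: Set[str] = set()
    changed = True
    while changed:
        changed = False
        for r in reduct:
            if r.pos <= model and r.head not in model:
                model.add(r.head)
                changed = True
    return model == true_atoms


def answer_sets(program: Program, a: Optional[Assignment] = None) -> Iterator[FrozenSet[str]]:
    b = expand(program, a or {})
    if b is None:
        return                              # case (2): conflict detected
    unassigned = sorted(atoms_of(program) - set(b))
    if not unassigned:                      # B is complete
        true_atoms = {x for x, v in b.items() if v}
        if is_stable(program, true_atoms):
            yield frozenset(true_atoms)     # case (1); otherwise case (3)
        return
    for value in (True, False):             # the two alternatives of a choice point
        yield from answer_sets(program, {**b, unassigned[0]: value})
```

For the program consisting of "p :- not q." and "q :- not p.", calling
answer_sets([rule("p", neg=["q"]), rule("q", neg=["p"])]) enumerates the two
answer sets {p} and {q}. The program "p :- q. q :- p." shows case (3): guessing
p true yields a complete, conflict-free set that lacks support and is rejected.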

† If atom a is added to B, then a receives the value true in B; if not a is
added to B, then a receives the value false in B.

3 Parallelism in ASP

The structure of the ASP computation previously illustrated can be easily
interpreted as an instance of a constraint-based computation [3], where the
application of the expansion rules (expand) represents the propagation step of
the constraint computation, and the selection of a literal (choose_literal)
represents a labeling step. From this perspective, it is possible to identify
two points of non-determinism in the computation: horizontal non-determinism,
which arises from the choice of the next expansion rule to apply (in expand),
and vertical non-determinism, which arises from the choice of the literal to add
to the partial answer set (in choose_literal). These two forms of
non-determinism bear strong similarities to the forms of non-determinism
recognized in logic programming [9,18]. The goal of this project is to explore
avenues for the exploitation of parallelism from these sources of
non-determinism. We will use the term vertical parallelism to indicate the use
of separate threads of computation to explore alternatives arising from vertical
non-determinism, and horizontal parallelism to indicate the use of separate
threads of computation to concurrently apply different expansion rules to a
partial answer set. Preliminary experiments have underlined the difficulty of
exploiting parallelism from ASP computations:

• considerable research has been invested in the design of algorithms for fast 
computation of answer sets; we desire to maintain such technology; 

• the structure of the computation (seen as a search tree, where the points of 
non-determinism correspond to the nodes of the tree) can be irregular and 
ill-balanced. Size of the branches can become very small — thus requiring 
granularity control and dynamic scheduling; 

• neither of the forms of non-determinism dominates the other: certain
ASP programs perform few choices of literals (i.e., calls to choose_literal)
and spend most of their time doing expansions, while other programs
explore a large number of choices. There are programs which lead directly
to answer sets with few or no choices (e.g., positive programs), and others
which invest most of their time in searching through a large set of literals.

ASP does not allow the immediate reuse of similar technology developed in the 
context of parallel execution of logic programs [9]. Existing mechanisms need 
to be adapted to deal with dynamic load balancing and granularity control — 
techniques have to be adopted to dynamically search for tasks and to avoid 
parallelizing small computations. In addition both vertical and horizontal par- 
allelism need to co-exist within the same system; both forms of parallelism may 
need to be (alternatively) employed during the execution of different programs, 
or even within a single program. In this paper we will focus on the exploitation 
of vertical parallelism — horizontal parallelism is still under investigation. 



4 Exploiting Vertical Parallelism in ASP 

Alternative choices of literals during the derivation of answer sets
(choose_literal in Figure 1) are independent and can be concurrently explored.
Each thread can lead to a distinct answer set — thus, vertical parallelism paral- 
lelizes the computation of distinct answer sets. As ensues from the research on 
parallelization of search tree applications and non-deterministic programming 
languages [14,1,9], the design of efficient data structures to maintain the correct 
state in the different concurrent branches is essential to achieve efficient paral- 
lel behavior. Note that straightforward solutions to related problems have been 
formally proved to be ineffective, leading to unacceptable overheads [14].

The architecture for a vertical parallel ASP engine that we envision is based
on the use of a number of ASP engines (agents) which concurrently explore
the search tree generated by the ASP computation; specifically, the search tree
whose nodes are generated by the execution of the choose_literal procedure.
Each agent explores a distinct branch of the tree; idle agents are allowed to
acquire unexplored alternatives generated by other agents. The major issue in
the design of such an architecture is to provide efficient mechanisms to support
this sharing of unexplored alternatives between agents. Each node P of the tree
is associated with a partial answer set B(P), the partial answer set computed in
the part of the branch preceding P. An agent acquiring an unexplored alternative
from P needs to continue the execution by expanding B(P) together with the
literal selected by choose_literal in node P. Efficient computation of B(P) for
the different nodes in the tree is a known complex problem [14].

Since ASP computations can be ill-balanced and irregular, we need to adopt 
a dynamic scheduling scheme, where at run-time idle agents navigate the system 
in search of available tasks. Thus, the partitioning of the available tasks between 
agents is performed dynamically and is initiated by the idle agents. This justifies 
the choice of a design where different agents are capable of traversing a shared 
representation of the search tree to acquire unexplored alternatives. 

Implementation Overview: The system is organized as a collection of agents 
which are cooperating in computing the answer sets of a program. Each agent 
is a separate ASP engine, which owns a set of private data structures, employed 
for the computation of answer sets. Additionally, a number of global data struc- 
tures, i.e., accessible by all the agents, are introduced to support cooperation 
between agents. This structuring of the system implies that we rely, in the
first instance, on a shared-memory architecture. The different agents share a common
representation of the ASP program to be executed. This representation is stored 
in a global data structure. Program representation has been implemented fol- 
lowing the general data structure design proposed in [6] — proved to guarantee 
very efficient computation of standard models. This representation is summa- 
rized in Figure 2. Each rule is represented by a descriptor; all rule descriptors 
are collected in a single array, which allows for a fast scan of the set of
rules. Each rule descriptor contains, among other things, pointers to the
descriptors of each atom which appears in the rule: the head atom and the atoms
which appear positive and negated in the body of the rule.

Each atom descriptor contains information such as (i) pointers to the rules
in which the atom appears in the head, as a positive body element, or as a
negative body element, and (ii) an atom array index. Unlike the schemes
adopted in sequential ASP engines [6,13], our atom descriptors do not contain
the truth value of the atom. Truth values of atoms are instead stored in a
separate data structure, called the atom array. Each agent maintains a private
atom array, as shown in Figure 2; this gives each agent an independent view
of the current (partial) answer set constructed, allowing atoms to have
different truth values in different agents. E.g., in Figure 2, the atom of
index i is true in the answer set of one agent (Agent 1), and false in the
answer set computed by another agent (Agent 2). Each agent acts as a separate
ASP engine. Each agent maintains a local stack structure (the trail) which keeps
track of the atoms whose truth value has already been determined. Each time the
truth value of an atom is determined (i.e., the appropriate entry in the atom
array is set to store the atom’s truth value), a pointer to the atom’s
descriptor is pushed onto the trail stack. The trail stack is used for two
purposes: (i) during expand, the agent uses the elements newly placed on the
trail to determine which program rules may be triggered for execution; (ii) a
simple test on the current size of the trail stack allows each agent to
determine whether all atoms have been assigned a truth value or not. The use of
a trail structure also provides convenient support for the exploitation of
horizontal parallelism.



[Figure: shared Rules structure; one private Atom Array per processor
(Processor 1, Processor 2, ..., Processor n)]



Fig. 2. Representation of Rules and Atoms 



To support the exploitation of vertical parallelism, we have also introduced 
an additional simple data structure: a choice point stack. The elements of the 
choice point stack are pointers to the trail stack. These pointers are used to iden- 
tify those atoms whose truth value has been “guessed” by the choose_literal 
function. The choice points are used during backtracking: they are used to deter- 
mine which atoms should be removed from the answer set during backtracking, 
as well as which alternatives can be explored to compute other answer sets. This 
is akin to the mechanisms used in trail-based constraint systems [16]. 
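
The interplay between the private atom array, the trail, and the choice point
stack can be sketched as follows. The sketch is in Python for brevity (the
prototype itself is written in C), and the class and method names (Agent,
assign, is_complete, backtrack) are hypothetical; atoms are identified by their
atom array index rather than by descriptor pointers.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Agent:
    n_atoms: int
    atom_array: List[Optional[bool]] = field(default_factory=list)  # private truth values
    trail: List[int] = field(default_factory=list)          # atom indices, in assignment order
    choice_points: List[int] = field(default_factory=list)  # trail positions of guessed atoms

    def __post_init__(self) -> None:
        self.atom_array = [None] * self.n_atoms

    def assign(self, atom: int, value: bool, guessed: bool = False) -> None:
        if guessed:
            # Remember where on the trail the guess sits.
            self.choice_points.append(len(self.trail))
        self.atom_array[atom] = value
        self.trail.append(atom)

    def is_complete(self) -> bool:
        # Size test on the trail: every atom has a truth value.
        return len(self.trail) == self.n_atoms

    def backtrack(self) -> bool:
        """Undo everything from the most recent guess and try its other value."""
        if not self.choice_points:
            return False                      # no alternative left
        pos = self.choice_points.pop()
        guessed_atom = self.trail[pos]
        flipped = not self.atom_array[guessed_atom]
        for atom in self.trail[pos:]:
            self.atom_array[atom] = None      # remove atoms from the answer set
        del self.trail[pos:]
        self.assign(guessed_atom, flipped)    # second alternative, no longer a choice point
        return True
```

Because the truth values live in the agent and not in the shared atom
descriptors, two Agent instances can hold different values for the same atom
index, which is the property the text relies on.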

The open issue which remains to be discussed is how agents interact in 
order to exchange unexplored alternatives — i.e., how agents share work. Each idle 
agent attempts to obtain unexplored alternatives from other active agents. In our 
context, an unexplored alternative is represented by a partial answer set together 
with a new literal to be added to it. In this project we have initially adopted a 
Copy-based approach to sharing: agents share work by exchanging a complete
copy of the current answer set (both chosen as well as determined literals) and
then performing local backtracking (see next section). Another important aspect
that has to be considered is termination detection. The overall computation 
needs to determine when a global fixpoint has been reached — i.e., all the answer 
sets have been produced and no agent is performing active computation. In the 
system proposed we have adopted a centralized termination detection algorithm. 

5 Copy-Based Approach 

In the Copy-based approach, during work sharing from agent A to agent B, the
entire partial answer set existing in A is directly copied to agent B. The use
of copying has been frequently adopted to support computation in constraint
programming systems [16] as well as to support or-parallel execution of logic
programs [1,9,18]. The partial answer set owned by A has an explicit
representation within the agent A; it is completely described by the content of
the trail stack. Thus, copying the partial answer set from A to B can be simply
reduced to the copying of the trail stack of A to B. Once this copying has been
completed, B needs to install the truth value of the atoms in the partial
answer set, i.e., store the correct truth values in the atom array. To make
this process possible without any further interaction between agents, the trail
has actually been turned into a trail-value, where each entry stores not only
the pointer to the atom but also its truth value. Computation of the “next”
answer set is obtained by identifying the most recent literal in the trail
whose value has been “guessed”, and performing local backtracking to it. The
receiving agent B also maintains a register (bottom_trail) which is set to the
top of the copied trail: backtracking is not allowed to proceed below the value
of this register. This avoids duplication of work by different agents. It is
possible to improve performance of the sharing operation by performing
incremental copying, i.e., by copying not the complete answer set but only the
difference between the answer sets in A and in B. A design to perform
incremental copying has been completed but not implemented yet.
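
One possible reading of the sharing operation is sketched below in Python. The
names (Agent, assign, share) are hypothetical, and two details are assumptions
rather than facts from the text: the donor gives up its most recent open choice
point, and the receiver sets bottom_trail after flipping the guessed literal.

```python
from typing import Dict, List, Tuple


class Agent:
    def __init__(self) -> None:
        self.atom_array: Dict[int, bool] = {}     # private truth values
        self.trail: List[Tuple[int, bool]] = []   # "trail-value": (atom, truth value)
        self.choice_points: List[int] = []        # trail positions of open guesses
        self.bottom_trail = 0                     # backtracking never goes below this

    def assign(self, atom: int, value: bool, guessed: bool = False) -> None:
        if guessed:
            self.choice_points.append(len(self.trail))
        self.atom_array[atom] = value
        self.trail.append((atom, value))


def share(donor: Agent, receiver: Agent) -> bool:
    """Copy the donor's trail, then backtrack locally to its latest open guess."""
    if not donor.choice_points:
        return False                              # nothing to share
    receiver.trail = list(donor.trail)            # copy of the whole partial answer set
    pos = donor.choice_points.pop()               # donor no longer explores this alternative
    atom, value = receiver.trail[pos]
    del receiver.trail[pos:]                      # local backtracking in the receiver
    receiver.trail.append((atom, not value))      # take the unexplored alternative
    receiver.atom_array = dict(receiver.trail)    # install truth values from the trail
    receiver.choice_points = []
    receiver.bottom_trail = len(receiver.trail)   # copied part is off limits
    return True
```

Since every trail entry carries its truth value, the receiver can rebuild its
atom array from the copied trail alone, without asking the donor again.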

Performance Results: In this section we present performance results for a
preliminary prototype which implements an ASP engine with Copy-based vertical
parallelism. The current prototype has been developed in C and the performance
results have been obtained on a 14-processor Sun Enterprise. The prototype is
capable of computing the answer sets of standard ASP programs, pre-processed
by the lparse grounding program [13]. The initial prototype is largely
unoptimized (e.g., it does not include many of the heuristics adopted in similar
ASP engines [13]) but its sequential speed is reasonably close to that of the
efficient Smodels system‡ [13]. All performance figures presented are in
milliseconds and have been achieved as average execution times over 10 runs on
a very lightly loaded machine. The benchmarks adopted are programs obtained from
various sources (all written by other researchers); they include some large
scheduling applications (sjss, reps), planners (logistics 1,2, strategic), graph
problems (color), as well as various synthetic benchmarks (T4, T5, T15, T8, P7).
These benchmarks range in size from a few tens of rules (e.g., T4, T5) to
hundreds of rules (e.g., reps).

‡ Comparisons made with the lookahead feature of Smodels turned off.

Name                   1 Agent    2 Agents    3 Agents    4 Agents    8 Agents   10 Agents
Scheduling (sjss)    131823.89    65911.94    44236.20    33038.57    18182.61    15187.08
Scheduling (reps)     72868.49    36434.25    22421.09    18217.13     9794.15     6854.99
Color (Random)      1198917.31   599458.66   355761.84   282763.52   164010.55   135013.21
Color (Ladder)         1092.80      590.70      401.77      337.28      287.58      286.07
Logistics (1)         10054.70     4429.38     4224.67     4206.99     2739.70     2674.12
Logistics (2)          6024.67     3089.57     2619.43     2042.26     1149.75     1079.69
Strategic             13662.61     7273.10     5043.04     3534.24     2015.14     1832.87
T5                       81.60       42.70       43.90       44.10       48.53       51.14
T4                       98.20       61.42       54.01       65.47       65.06       65.96
T8                     3232.58     1602.89     1086.90      798.38      437.77      362.49
P7                     3160.00     1847.95     1392.07     1078.49      556.34      497.64
T15                     416.00      208.01      138.67      118.17      120.23      122.35
T23                    2695.81     1395.12     1017.61      769.31      402.55      464.18



Table 1. Copy-based Sharing: Execution Times (msec.) 



Table 1 reports the execution times (in msec.) observed for the different
benchmarks. As can be seen from these figures, the system is capable of
producing good speedups for most of the selected benchmarks. On the scheduling
(sjss, reps), graph coloring, and planning (strategic, logistics) benchmarks the
speedups are very high (often in the range 8.5-10 using 10 agents); for a number
of benchmarks we have observed linear speedups for small numbers of agents (from
2 to 5 agents). This is quite a remarkable result, considering that these
benchmarks are very large and some produce rather unbalanced computation trees,
with tasks having very different sizes. The apparently low speedup observed on
the logistics benchmark with the first plan (logistics 1) is actually a positive
result, since the number of choices performed across the computation is just 2
(thus we cannot expect a speedup higher than 4). On the very fine-grained
benchmarks T4 and T5 the system does not behave as well; in particular we can
observe a degradation of speedup for a large number of agents. In this case the
increased number of interactions between agents outweighs the advantages of
parallelization, as the different agents attempt to exchange very small tasks.
In T4 we even observe a slow-down when using more than 8 agents. The results are
also quite good for T8, a benchmark which produces a very large number of
average-to-small size tasks. This was considered by the authors to be a
difficult benchmark, due to the potential for generating a very large number of
task switches. The speedups reported in this case are excellent. This is partly
due to the small cost, in this particular case, of copying, as well as the
adoption of a smarter scheduling strategy, made possible by the use of copying
(as discussed more extensively in the next section). As for the benchmark P7,
the situation is sub-optimal. P7 has a very small number of task switches, but
generates large answer sets. Thus, the need to copy large answer sets during
sharing operations penalizes the overall performance. We expect this case to
become less of a problem with the introduction of incremental copying
techniques, i.e., the agents compute the actual difference between the answer
sets currently present in their stacks, and transfer only that difference. For
the fine-grained benchmarks (such as T4 and T5) additional steps are needed in
order to achieve better parallel performance. We have experimented with a simple
optimization, which semi-automatically unfolds selected predicates a constant
number of times, in order to create larger grain tasks (by effectively combining
together consecutive tasks). This simple optimization has produced improvements,
as shown in Table 2. In particular, we can observe that the slow-down effect has
disappeared from the speedup curves.
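
The speedups quoted in this discussion follow from the execution times as the
ratio between the one-agent time and the n-agent time. The short Python sketch
below recomputes a few of them; the times are copied from Table 1, while the
names TABLE_1, speedup and efficiency are ours, and efficiency (speedup per
agent) is an extra measure not used in the text.

```python
from typing import Dict

# Execution times in msec., copied from Table 1 (number of agents -> time).
TABLE_1: Dict[str, Dict[int, float]] = {
    "Scheduling (sjss)": {1: 131823.89, 2: 65911.94, 4: 33038.57, 8: 18182.61, 10: 15187.08},
    "Logistics (1)":     {1: 10054.70,  2: 4429.38,  4: 4206.99,  8: 2739.70,  10: 2674.12},
    "T5":                {1: 81.60,     2: 42.70,    4: 44.10,    8: 48.53,    10: 51.14},
    "T4":                {1: 98.20,     2: 61.42,    4: 65.47,    8: 65.06,    10: 65.96},
}


def speedup(times: Dict[int, float], agents: int) -> float:
    """Time with one agent divided by time with the given number of agents."""
    return times[1] / times[agents]


def efficiency(times: Dict[int, float], agents: int) -> float:
    """Speedup per agent; 1.0 corresponds to a linear speedup."""
    return speedup(times, agents) / agents


if __name__ == "__main__":
    for name, times in TABLE_1.items():
        row = ", ".join(f"{n} agents: {speedup(times, n):.2f}" for n in sorted(times) if n > 1)
        print(f"{name:18s} {row}")
```

With these figures sjss reaches a speedup of about 8.68 on 10 agents and
logistics 1 about 3.76, consistent with the ranges stated above, while T4 and T5
stay near 1.5 and 1.6.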

Note that the sequential overhead observed in all cases (the ratio between 
the sequential engine and the parallel engine running on a single processor) is 
extremely low, i.e., within 5% for most of the benchmarks. 



Name    1 Agent    2 Agents    3 Agents    4 Agents    8 Agents   10 Agents
T5          1.0   1.97/1.91   1.98/1.86   1.98/1.85   1.99/1.68   1.99/1.60
T4          1.0   1.92/1.60   1.93/1.82   1.95/1.50   1.96/1.51   1.98/1.49



Table 2. Speedup Improvement using Task-collapsing Optimization (new/old) 



Hybrid Approaches to Sharing: To further improve the parallel performance
of our engine, we have experimented with a hybrid approach to sharing.
In the hybrid approach two alternative strategies are available during sharing,
and they are selected based on the characteristics of the specific sharing
operation. In our experiment we have opted for a second alternative sharing
strategy based on Recomputation. In a recomputation scheme, instead of copying
the complete answer set from one agent to another, only the core of the answer
set is transferred. The core is represented by the set of literals which have
been chosen by choose_literal in the construction of the answer set. A single
expand operation applied to the core will reconstruct the desired answer set.

Figure 3 illustrates the differences in speedups observed by using
recomputation vs. copying on some of the benchmarks. In most benchmarks the two
approaches do not show relevant differences: the speedups observed differ only
by a few decimal points. On the other hand, we have observed more substantial
differences on the larger benchmarks (e.g., the sjss and reps scheduling
applications and P7). These differences arise because of the size of the copied
answer sets. For the scheduling problems, the copied answer sets are very large;
in this case the cost of copying a very large block of memory is substantially
smaller than the cost of recomputing the answer set starting from its core of
chosen literals. Some experimental results have indicated that for answer sets
with fewer than 350 elements recomputation provides better results than copying,
while for answer sets larger than this threshold copying outperforms
recomputation. In benchmarks such as sjss and reps the answer sets exchanged
have an average size of 2500 elements. This observation is confirmed by the
behavior observed in benchmarks such as P7: in this case, the answer sets
exchanged have sizes in the order of 300 elements, and, as we can see from
Figure 3, recomputation indeed performs better than copying.

The main conclusion we can draw is the following: for programs dealing with 
large answer sets, copying is better than recomputation. In the current prototype 
we automatically select the sharing strategy based on the size of the answer set, 
thus attempting to subsume the advantages of both methods. 
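
The size-based selection can be sketched in a few lines of Python. Only the
threshold of 350 elements comes from the text; the names (Entry,
choose_strategy, payload) and the treatment of a set of exactly 350 elements as
a copying case are assumptions made for illustration.

```python
from typing import List, NamedTuple, Tuple


class Entry(NamedTuple):
    atom: int
    value: bool
    chosen: bool      # True if the literal was guessed by choose_literal


COPY_THRESHOLD = 350  # elements; the experimentally observed threshold


def choose_strategy(answer_set_size: int) -> str:
    """Small answer sets are recomputed, large ones are copied."""
    return "recompute" if answer_set_size < COPY_THRESHOLD else "copy"


def payload(trail: List[Entry]) -> Tuple[str, List[Tuple[int, bool]]]:
    """Return the strategy and the literals that would be sent to the idle agent."""
    strategy = choose_strategy(len(trail))
    if strategy == "copy":
        literals = [(e.atom, e.value) for e in trail]               # whole answer set
    else:
        literals = [(e.atom, e.value) for e in trail if e.chosen]   # only the core
    return strategy, literals
```

With recomputation the receiving agent still has to run one expand over the
received core before it can continue, which is the cost that copying avoids.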



[Figure: two speedup plots titled "Copying vs. Recomputation"; x-axis: Number
of Agents]




Fig. 3. Comparison between Recomputation and Copying 



6 Scheduling Issues 

In our system, two scheduling decisions have to be taken by each idle processor 
in search for work: (i) from which agent work will be taken, and (ii) which 
unexplored alternative will be taken from the selected agent. In the current 
prototype, we have tackled the first issue by maintaining a work-load count (i.e., 
number of local unexplored alternatives) for each agent and attempting to take 
work from the agent with the highest work-load. This simple scheme has proved 
to work well in practice. The second decision turned out to be more complicated 
and to have a deeper impact on the performance. Our experiments indicate that
the choice of which unexplored alternative to select (i.e., which choice point to 
steal from another agent) may lead to substantial variations in performance. 

In our experiments we have considered two approaches to this problem. In
the first approach, agents are forced to steal the first choice point (i.e., the
oldest choice point) from another agent (we call this approach Top scheduling).
This technique was expected to perform well since: (i) detecting the first
choice point is a fast operation; (ii) selecting the first choice point reduces
the size of the partial answer set transferred between agents; (iii) if the
computation tree is balanced, then by taking the first choice point we should
minimize the frequency of sharing operations. The alternative technique
considered is the dual of the one described: at each sharing operation the last
choice point created is taken (Bottom scheduling). This approach is expected to
have the following advantage: with simple modifications to the backtracking
scheme, it makes it possible to share at once not just a single choice point but
a collection of them, e.g., all the choice points owned by an agent. On the
other hand, the cost of sharing work under this scheme is higher, since larger
answer sets have to be exchanged.

The implementation of the first method is relatively simple; the first choice 
point is easily detected (by keeping an additional register in each agent for 
this purpose). The choice point indicates the segment of trail that has to be 
transferred to the other agent. The second method has been realized as follows: 
The last choice point is easily detected as it lies on the top of the choice
point stack; this makes it possible to determine immediately the part of the
trail to be copied. To allow sharing of multiple choice points at once, we push
on the choice point stack
a special choice point, which simply represents a link to a choice point lying in 
another processor’s stack. This allows the backtracking activity to seamlessly 
flow between choice points belonging to different agents. 
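
The two scheduling decisions can be illustrated with the following Python
sketch. The function names (pick_victim, steal) and the list-based
representation of the trail and of the choice point stack are hypothetical; the
special "link" choice point used for Bottom scheduling is only approximated
here by returning all shared positions at once.

```python
from typing import Dict, List, Optional, Tuple


def pick_victim(work_load: Dict[int, int], thief: int) -> Optional[int]:
    """First decision: take work from the agent with the highest work-load count."""
    others = {agent: n for agent, n in work_load.items() if agent != thief and n > 0}
    return max(others, key=others.get) if others else None


def steal(trail: List[str], choice_points: List[int], policy: str) -> Tuple[List[str], List[int]]:
    """Second decision: which choice point to take, and which trail segment to copy.

    choice_points holds trail positions, oldest first. The result is the copied
    trail segment and the choice points made available to the thief.
    """
    if not choice_points:
        raise ValueError("victim has no unexplored alternatives")
    if policy == "top":
        pos = choice_points[0]            # oldest choice point: short segment
        shared = [pos]
    elif policy == "bottom":
        pos = choice_points[-1]           # newest choice point: longer segment,
        shared = list(choice_points)      # but all choice points shared at once
    else:
        raise ValueError(f"unknown policy: {policy}")
    return trail[:pos + 1], shared
```

For a trail of seven literals with choice points at positions 1 and 4, the Top
policy copies two literals and shares one choice point, whereas the Bottom
policy copies five literals and shares both.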

We have implemented both schemes and compared them on the selected 
pool of benchmarks. Figure 4 compares the speedups achieved using the two 
scheduling methods in the Copy-based engine. The results clearly indicate that 
the second scheme (Bottom scheduling) is superior in the large majority of the 
cases. Particularly significant are the differences in the sjss and the graph
coloring problems. These are all programs where a large number of choice points
are created; the Bottom scheduling scheme makes it possible to share a large
number of alternatives in a single sharing operation, thus reducing the number
of scheduling interactions between agents. The Top scheduling scheme provides
better performance in those benchmarks where either there are few choices (e.g.,
T15) or the choices tend to be located always towards the beginning of the trail
stack (T8).

Also in this case we can clearly identify a preferable scheme (the Bottom 
scheduling scheme); nevertheless a mixed approach which selects alternative 
scheduling policies depending on the structure of the program or the structure 
of the current answer set is likely to provide superior performance. 



[Figure: three speedup plots titled "Scheduling (Top vs. Bottom)"; x-axis:
Number of Agents]



Fig. 4. Scheduling: Top vs. Bottom Scheduling

7 From Shared to Distributed Memory

The experimental observations performed so far have underlined the potential for 
good performance from parallel execution of ASP programs. The natural ques- 
tion that comes to mind is whether the satisfactory behavior of these parallel 
computations can be sustained in the context of a Distributed Memory architec- 
ture (DMM). The reason to consider a move towards DMMs is the scalability 
potential offered by such architectures; furthermore, an increasing number of 
DMMs is becoming available at a very affordable cost (compared to the cost of 
similar size shared memory machines), thanks to the availability of convenient 
interconnection networks (e.g., Myrinet) and the use of off-the-shelf
components (e.g., standard Pentium-based boxes).

We have performed the first step in porting our parallel ASP engine from
shared memory machines (i.e., the Sun Enterprise used for the experiments
described earlier) to a DMM system. The DMM architecture used is a 32-node
Pentium-333 Beowulf system, built using Myrinet-SAN switches.

The development of a DMM engine for ASP started from the copy-based 
version of our shared memory engine. The original design of the engine required 
relatively few changes during the process of porting to DMM. In the shared 
memory version the ASP program is a shared entity (with the exception of the 
separate arrays to store atoms’ truth values), while in the DMM version each 
processor, during the initial setup phase, reads the input program and builds 
the internal program representation. In turn, the copying operations need to 
be replaced with corresponding send-receive operations. Finally, in the shared 
memory version, scheduling is initiated by idle agents, which take charge of di- 
rectly accessing the data areas belonging to other agents in search of unexplored 
tasks. In the DMM version this has to be completely replaced by an alternative 
scheduling methodology. The first two aspects are very simple to handle; indeed,
the design of the code allowed us to adapt the code in a few hours. The
last point is more complex, since dynamic scheduling on DMMs is a relatively
open issue. In the current prototype we have adopted a simple centralized
scheduling approach. One agent is dedicated exclusively to the task of
distributing work between idle and active agents, as well as recognizing global
termination. Each agent lazily maintains a work-load array, which keeps track of
the load of each agent. The central agent uses the work-load array to match idle
and active agents. Each entry in the work-load array is an estimate of the load
of an agent, and it is associated with a time-stamp stating the last point in
time when the entry was updated. The work-load array is piggy-backed on each
message exchanged, and each agent combines its own work-load array with the
received one (selecting the most recent entries according to time stamps) to
update its view of the system.
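
The lazy maintenance of the work-load array can be sketched as follows in
Python. The representation of an entry as a (load estimate, time stamp) pair
and the function names merge and match are assumptions; the text does not
specify how the central agent picks among several active agents, so the sketch
simply selects the one with the highest estimated load.

```python
from typing import Dict, Optional, Tuple

Entry = Tuple[int, float]        # (estimated load, time stamp of last update)
WorkLoad = Dict[int, Entry]      # agent id -> entry


def merge(local: WorkLoad, received: WorkLoad) -> WorkLoad:
    """Combine two views, keeping for each agent the most recently updated entry."""
    merged = dict(local)
    for agent, (load, stamp) in received.items():
        if agent not in merged or stamp > merged[agent][1]:
            merged[agent] = (load, stamp)
    return merged


def match(view: WorkLoad, idle_agent: int) -> Optional[int]:
    """Central agent: pair an idle agent with the busiest agent it knows of."""
    loads = {a: load for a, (load, _) in view.items() if a != idle_agent and load > 0}
    return max(loads, key=loads.get) if loads else None
```

Because the array travels with ordinary messages, a view may be stale; the time
stamps only guarantee that merging never replaces a newer estimate with an
older one.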

Performance Results: We have executed a number of benchmarks using the 
distributed version of the ASP engine. The benchmarks considered are a subset of 
those used to evaluate the shared memory engine. In particular we have focused 
on those benchmarks which seem to offer tasks with larger grain, where there 
is better hope to use parallelism to offset the higher communication costs. The
results in terms of execution times (μs) and speedups are reported in Table
3. Of the benchmarks presented in Table 3, the only poor performance can be
observed in the Color benchmark in the case of a ladder graph. We have included 
this benchmark (which is very fine grained) to underline the obvious fact that a 
distributed execution of a small benchmark with fine grained tasks is not viable. 
The performance obtained in the other benchmarks was more than satisfactory. 
The speedups are quite good and the exploitation of parallelism is effective. 



[Figure: two speedup plots titled "Shared vs. Distributed Memory"; legend
includes sjss Distrib., sjss SHMem, T8 Distrib.]




Fig. 5. Speedup Comparison between Shared Memory and Distributed Execution



Figure 5 compares the speedups for the same executions on shared memory 
and on DMM machines. As we can see from these curves, the DMM implementation
does not lose much due to the higher communication costs. This is a very
positive result, considering that these results have been obtained on the
real-life applications used as benchmarks. It is also particularly interesting
to observe the T8 benchmark: the speedups observed on the distributed engine are
higher than those observed on a shared memory system, especially for large
numbers of agents.
Our intuition in this case is that the messages exchanged during execution in 
the shared memory implementation create a contention on the communication 
bus which instead does not appear in the case of the distributed memory engine. 



Name               1 Agent          2 Agents          3 Agents          4 Agents          8 Agents
Color (Ladder)        3452       2499 (1.38)       2354 (1.47)       2929 (1.18)       2954 (1.17)
Color (Random2)    2067987    1162905 (1.78)     829685 (2.51)     604586 (3.43)     310622 (6.66)
Logistics 2        3937246    2172124 (1.81)    1842695 (2.14)    1652869 (2.38)    1041534 (3.78)
Strategic            76207      40169 (1.90)      28327 (2.69)      21664 (3.52)      12580 (6.06)
sjss              93347226    46761140 (2.0)   31012367 (3.01)   22963465 (4.06)   13297326 (7.02)
T8                 1770106      865175 (2.0)      590035 (3.0)      444730 (4.0)      226930 (7.8)



Table 3. Execution Times (in μs) and Speedups on Beowulf

8 Conclusions and Future Work

The problem tackled in this paper is the efficient execution of Answer Set Pro- 
grams. Answer Set Programming is an emerging programming paradigm which 
has its roots in logic programming, non-monotonic reasoning, and constraint 
programming. This blend has led to a paradigm which provides for very declar-
ative programs (more declarative, e.g., than traditional Prolog programs). ASP
has been proved to be very effective for specific application areas, such as plan- 
ning and design of common-sense reasoning engines for intelligent agents. The 
goal of this work is to explore the use of parallelism to improve execution perfor- 
mance of ASP engines. We have determined two forms of parallelism which can 
be suitably exploited from a constraint-based ASP engine. We have focused on 
the exploitation of one of these two forms of parallelism (that we have called ver- 
tical parallelism) and presented an efficient parallel engine based on these ideas. 
Alternative approaches to perform work sharing and scheduling have been ana- 
lyzed and studied in our implementation. Performance results for our prototype 
running on shared and distributed memory machines have been presented. 

The project is currently focusing on integrating the second form of
parallelism, horizontal parallelism, into the existing engine. Our experiments
make it clear that a number of applications would benefit considerably from
this alternative form of parallelism. This step requires careful design,
especially if both forms of parallelism are to coexist efficiently within the
same engine.
Abstract. Functional programming languages are not generally associated
with computationally intensive tasks such as computer vision. We show that a
declarative programming language like Haskell is effective for describing
complex visual tracking systems. We have taken an existing C++ library for
computer vision, called XVision, and used it to build FVision (pronounced
“fission”), a library of Haskell types and functions that provides a
high-level interface to the lower-level XVision code. Using functional
abstractions, users of FVision can build and test new visual tracking systems
rapidly and reliably. The use of Haskell does not degrade system performance:
computations are dominated by low-level calculations expressed in C++ while
the Haskell “glue code” has a negligible impact on performance.

FVision is built using functional reactive programming (FRP) to ex- 
press interaction in a purely functional manner. The resulting system 
demonstrates the viability of mixed-language programming: visual track- 
ing programs continue to spend most of their time executing low-level 
image-processing code, while Haskell’s advanced features allow us to de- 
velop and test systems quickly and with confidence. In this paper, we 
demonstrate the use of Haskell and FRP to express many basic abstrac- 
tions of visual tracking. 



1 Introduction 

Algorithms for processing dynamic imagery — video streams composed of a 
sequence of images — have reached a point where they can now be usefully em- 
ployed in many applications. Prime examples include vision-driven animation, 
human-computer interfaces, and vision-guided robotic systems. However, despite 
rapid progress on the technological and scientific fronts, the fact is that software 
systems which incorporate vision algorithms are often quite difficult to develop 
and maintain. This is not for lack of computing power or underlying algorithms. 
Rather, it has to do with problems of scaling simple algorithms to address com- 
plex problems, prototyping and evaluating experimental systems, and effective 
integration of separate, complex, components into a working application. 
There have been several recent attempts to build general-purpose image
processing libraries, for example [9, 13, 8]. In particular, the Intel Vision
Libraries [7] is an example of a significant software effort aimed at creating
a general-purpose library of computer vision algorithms. Most of these efforts
have taken the traditional approach of building object or subroutine libraries
within languages such as C++ or Java. While these libraries have well-designed
interfaces and contain a
large selection of vision data structures and algorithms, they tend not to provide 
language abstractions that facilitate dynamic vision. 

The research discussed in this paper started with XVision, a large library
of C++ code for visual tracking. XVision was designed using traditional
object-oriented techniques. Although computationally efficient and engineered from the
start for dynamic vision, the abstractions in XVision often failed to solve many 
basic software engineering problems. In particular, the original XVision often 
lacked the abstraction mechanisms necessary to integrate primitive vision com- 
ponents into larger systems, and it did not make it easy to parameterize vision 
algorithms in a way that promoted software reusability. 

Rather than directly attacking these issues in the C++ world, we chose a
different approach: namely, using declarative programming techniques. FVision
is the result of our effort, a Haskell library that provides high-level abstractions
for building complex visual trackers from the efficient low-level C++ code found
in XVision. The resulting system combines the overall efficiency of C++ with
the software engineering advantages of functional languages: flexibility, compos- 
ability, modularity, abstraction, and safety. 

This paper is organized as a short tour of our problem domain, punctuated 
by short examples of how to construct and use FVision abstractions. To put 
visual tracking into a more realistic context, some of our examples include ani- 
mation code implemented in Fran, an animation system built using FRP[1]. Our 
primary goal is to explore issues of composability: we will avoid discussion of
the underlying primitive tracking algorithms and focus on methods for
transforming and combining these primitive trackers. All of our examples are
written in Haskell; we assume the reader is familiar with the basics of this
language. See haskell.org for further information about Haskell and functional
programming. FRP is a library of types and functions written in Haskell. The FRP
library has been evolving rapidly; some function and type names used here may 
not match those in prior or future papers involving FRP. We do not assume prior 
experience with FRP in this paper. Further information regarding FRP can be 
found at haskell.org/frp. 



2 Visual Tracking 

Tracking is the inverse of animation. That is, animation maps a scene description 
onto a (much larger) array of pixels, while tracking maps the image onto a much 
simpler scene description. Animation is computationally more efficient when the 
scene changes only slightly from one frame to the next: instead of re-rendering the 
entire scene, a clever algorithm can reuse information from the previous frame 
and limit the amount of new rendering needed. Tracking works in a similar way:
computationally efficient trackers exploit the fact that the scene changes only 
slightly from one frame to the next. 

Consider an animation of two cubes moving under 3D transformations t1
and t2. These transformations translate, scale, and rotate each cube into its
location within the scene. In Fran, the following program plays this animation:

scene :: Transform3B -> Transform3B -> GeometryB
scene t1 t2 = cube1 `unionG` cube2
  where cube1 = unitCube `transformG` t1
        cube2 = unitCube `transformG` t2

Rendering this animation is a process of generating a video (i.e. image stream) 
that is a composition of the videos of each cube, each of those in turn constructed 
from the individual transformations t1 and t2. In computer vision we process
the image stream to determine the location and orientation of the two cubes,
thus recovering the transformation parameters t1 and t2.

We accomplish this task by using knowledge of the scene structure, as cap- 
tured in a model, to combine visual tracking primitives and motion constraints 
into an “observer.” This observer processes the video input stream to determine 
the motion of the model. We assume that the behavior of objects in the video is 
somehow “smooth”: that is, objects do not jump suddenly to different locations 
in the scene. There are also a number of significant differences between vision 
and animation: 

- Tracking is fundamentally uncertain: a feature is recognized with some
  measurable error. These error values can be used to resolve conflicts
  between trackers: trackers that express certainty can “nudge” other less
  certain trackers toward their target.

- Efficient trackers are fundamentally history sensitive, carrying information
  from frame to frame. Animators generally hide this sort of optimization from
  the user.

- Animation builds a scene “top down”: complex objects are decomposed
  unambiguously into simpler objects. A tracker must proceed “bottom up” from
  basic features into a more complex object, a process which is far more open
  to ambiguity.

The entire XVision system consists of approximately 27,000 lines of C++
code. It includes generic interfaces to hardware components (video sources and 
displays), a large set of image processing tools, and a generic notion of a “track- 
able feature.” Using this as a basis, XVision also defines several trackers: spe- 
cialized modules that recognize and follow specific features in the video image. 
XVision includes trackers for features such as edges, corners, reference images, 
and areas of known color. These basic tracking algorithms were re-expressed in 
Haskell using basic C++ functions imported via GreenCard, a tool for importing
C code into Haskell. 
2.1 Primitive Trackers

Primitive trackers usually maintain an underlying state. This state defines the 
location of the feature as well as additional status information such as a confi- 
dence measure. The form of the location is specific to each sort of tracker. For a 
color blob it is the area and center; for a line it is the two endpoints. 
Fig. 1. Figure of XVision feedback loop.



Figure 1 illustrates this idea conceptually for the specific case of an SSD (Sum 
of Squared Differences) tracking algorithm [2]. This algorithm tracks a region 
by attempting to compute an image motion and/or deformation to match the 
current appearance of a target to a fixed reference. The steps in the algorithm 
are: 

1. Acquire an image region from the video input using the most recent estimate
of target position and/or configuration. In addition, reverse transform (warp) 
it. The acquired region of interest is generally much smaller than the full 
video frame. Pixels are possibly interpolated during warping to account for 
rotation or stretching. 

2. Compute the difference between this image and the reference image (the 
target). 

3. Determine what perturbation to the current state parameters would cause 
the (transformed) current image to best match the reference. 

4. Use this data to update the running state. 
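The four steps above can be sketched on a toy one-dimensional "image". This is an illustration under simplifying assumptions, not XVision's algorithm: the state is a single integer position, there is no warping, and the perturbation of step 3 is found by brute-force search over three offsets rather than by the linearized estimate a real SSD tracker uses. All names are hypothetical.

```haskell
import Data.List (minimumBy)
import Data.Ord (comparing)

type Signal = [Double]   -- toy 1-D stand-in for an image (assumption)

-- Step 1: acquire a window of the given width at the estimated position.
acquire :: Int -> Int -> Signal -> Signal
acquire width pos = take width . drop pos

-- Step 2: sum of squared differences between the reference and a window.
ssd :: Signal -> Signal -> Double
ssd ref win = sum [ (r - w) ^ (2 :: Int) | (r, w) <- zip ref win ]

-- Steps 3 and 4: try small perturbations of the current position and
-- return the best position together with its residual.
-- Assumes at least one candidate window fits inside the frame.
step :: Signal -> Int -> Signal -> (Int, Double)
step ref pos frame =
  minimumBy (comparing snd)
    [ (p, ssd ref (acquire n p frame))
    | d <- [-1, 0, 1]
    , let p = pos + d
    , p >= 0, p + n <= length frame ]
  where n = length ref
```

Because only offsets of one pixel are examined, the sketch also shows why such a tracker loses its target when the motion between frames is large.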
As this process requires only a small part of the original video frame it is very
efficient compared to techniques that search an entire image. It also makes use 
of the fact that motion from frame to frame is small when computing the pertur- 
bation to the current state. As a consequence, it requires that the target move 
relatively consistently between frames in the image stream: an abrupt movement 
may cause the tracker to lose its target. 

In XVision, trackers can be assembled into hierarchical constraint networks 
defined by geometric knowledge of the object being tracked (the model). This 
knowledge is typically a relationship between different points or edges in the 
object’s image, such as the four corners of a square. If one corner is missing in 
the image (perhaps due to occlusion) then the positions of the other three define 
the expected location of the missing corner. This allows a disoriented tracker 
to resynchronize with its target. Although XVision includes object-oriented ab- 
stractions for the construction of hierarchical constraint networks, these abstrac- 
tions had proven difficult to implement and limited in expressiveness. In the 
remainder of the paper we describe a rich set of abstractions for tracker compo- 
sition. 



3 Abstractions for Visual Tracking 

A camera converts a continuously changing scene into a discrete stream of
images. In previous work we have defined trackers in terms of standard stream
processing combinators [11]. Here these combinators are subsumed by FRP. FRP
supports inter-operation between continuous time systems and discrete time
(stream processing) systems. This allows FVision to combine with animation
systems such as Fran or robotics systems such as Frob [10].

Before examining the construction of trackers, we start by demonstrating the
use of a tracker in conjunction with animation. The function followMe, given
later in this section, processes a video stream, of type Video clk, and
generates an animation in which a red dot is drawn over a tracked image. The
type Video is defined as follows:

type Video clk = CEvent clk Image

The type CEvent clk a in FRP denotes a stream of values, each of type a,
synchronized to clock clk. This clock type allows FVision to detect
unintentional clock mismatches. Since none of the code in this paper is tied to
a specific clock, the clk argument to CEvent will always be uninstantiated.

The user must first define the reference image to be tracked by using the 
mouse to select a rectangular area around the target in the video image. The 
rectangular area is marked by pressing the mouse to indicate the top-left corner, 
then dragging and releasing the mouse at the bottom-right corner. As the mouse 
is being dragged, an animated rectangle is drawn over the video image. Once the 
mouse is released, the rectangle is replaced by a red dot centered on the rectangle 
and an SSD tracker is created to move the dot through successive frames. 

followMe :: Video clk -> PictureB
followMe video =
  videoB `untilB`
    (lbp `snapshot_` mouse) ==>
      \corner1 -> rectangle (lift0 corner1) mouse `over` videoB
        `untilB`
          ((lbr `snapshot_` (pairB mouse videoB) ==>
            \(corner2, image) ->
              let tracker = ssdTracker (getimage image corner1 corner2)
                  mid     = midpoint corner1 corner2
                  b       = runTrackerB videoB mid mid tracker
              in (redDot `transform2B` b)
                   `over` videoB ))
  where videoB = stepper nullImage video   -- convert image stream to behavior
        redDot = ...                       -- draw a red dot
        rectangle c1 c2 = ...              -- draw a rectangle

The above code can be read: “Behave as the video input until the left mouse 
button is pressed, at which time a snapshot of the mouse position is taken. Then 
draw a rectangle whose top left-hand corner is fixed but whose bottom right- 
hand corner is whatever the current mouse position is. Do this until the left 
mouse button is released, at which point a snapshot of both the mouse position 
and the video are taken. A tracker is initialized with the midpoint of the two 
corners as the initial location and the snapshot image as the reference. Use the 
output of the tracker to control the position of a red dot drawn over the video 
image.” For example, if you draw a rectangle around a face the tracker can then
follow this face as it moves around in the camera’s field of view. This tracker is
not robust: it may lose the face, at which point the red dot will cease to move 
meaningfully. 

Functions such as untilB and snapshot_ are part of FRP. By convention, 
types and functions suffixed with “B” deal with behaviors: objects that vary con- 
tinuously with time. Type synonyms are used to abbreviate Behavior Picture 
as PictureB. Some functions are imported from XVision: the getimage func- 
tion extracts a rectangular sub-image from the video stream. This image serves 
as a reference image for the SSD (Sum of Squared Differences) tracker. Once the
reference image is acquired, the tracker (the ssdTracker function) defines a 
behavior that follows the location of the reference image in the video stream. 
The runTrackerB function starts the tracker, pointing it initially to the selected 
rectangle, defining the transformation used in the animation. 



3.1 Types for Tracking 

The goal of this research is to define trackers in a compositional style. Following 
the principles of type-directed design, we start with some type definitions. A
tracker is composed of two parts: an observer which acquires and normalizes 
some subsection of the video image, and a stepper which examines this sub- 
image and computes the motion of the tracker. 

type Observer observation a = (a, Image) -> observation 

type Stepper measure observation a = (a, observation) -> measure a 
The observer takes the present location, a, of the tracker and the current frame
of video and returns an observation: usually one or more sub-images of the frame. 
The location may be designated using a single point (the Point2 type), as used 
in color blob tracking, or perhaps by a point, rotation, and scale (represented by 
the Transform2 type). Observers may also choose to sample at lower resolutions,
dropping every other pixel for example. In any case, the type a is determined by 
the particular observer used. 

The stepper adjusts the location of the tracked feature based on the current
location and the observation returned by the observer. The stepper may also
compute additional values that measure accuracy or other properties of the
tracker. We choose to make the measure a type constructor, measure a, rather
than a separate value, (measure, a), so as to use overloading to combine
measured values. XVision defines a variety of steppers, including the SSD
stepper, color blob steppers, edge detectors, and motion detectors.

Measurement types are defined to be instances of the Valued class, whose
valueOf method extracts the value from its containing measurement type:

class Valued c where
  valueOf :: c a -> a

Measurement types are also in the Functor class, allowing modification of the
contained value.

The Residual type used by the SSD tracker is an example of a measurement:

data Residual a =
  Residual { resValue :: a, residual :: Float }

instance Valued Residual where
  valueOf = resValue
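As a self-contained illustration of how a measurement type behaves, the sketch below repeats the Valued class and the Residual type and adds a Functor instance. The paper states that measurement types are in Functor but does not show the instance, so the one here is an assumption: it transforms the contained value and leaves the residual untouched.

```haskell
class Valued c where
  valueOf :: c a -> a

data Residual a = Residual { resValue :: a, residual :: Float }
  deriving Show

instance Valued Residual where
  valueOf = resValue

-- Assumed instance: fmap changes the location, not the residual.
instance Functor Residual where
  fmap f (Residual x r) = Residual (f x) r

-- Example: shift a measured 1-D location by one pixel.
shifted :: Residual Int
shifted = fmap (+ 1) (Residual 10 0.25)
```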

Combining an observer and a stepper yields a tracker: a mapping from a 
video stream onto a stream of measured locations. 

type Tracker measure a = Stepper measure Image a 

Note that Tracker is a refinement of the Stepper type. Trackers are constructed 
by combining an observer with a stepper: 

mkTracker :: Observer observation a -> Stepper measure observation a ->
             Tracker measure a
mkTracker o s = \(loc, image) -> let ob = o (loc, image) in s (loc, ob)
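To make the composition concrete, here is a small runnable sketch that uses the three type synonyms and mkTracker with toy stand-ins. The one-dimensional Image type, the window3 observer, the brightest stepper, and the use of Maybe as the measure are all assumptions made for illustration; none of them are XVision primitives.

```haskell
type Image = [Int]   -- toy 1-D image of brightness values (assumption)

type Observer observation a = (a, Image) -> observation
type Stepper measure observation a = (a, observation) -> measure a
type Tracker measure a = Stepper measure Image a

mkTracker :: Observer observation a -> Stepper measure observation a
          -> Tracker measure a
mkTracker o s = \(loc, image) -> let ob = o (loc, image) in s (loc, ob)

-- Hypothetical observer: a three-pixel window starting at the location.
window3 :: Observer [Int] Int
window3 (loc, img) = take 3 (drop loc img)

-- Hypothetical stepper: move to the brightest pixel of the window.
-- Maybe plays the role of the measure (Nothing means an empty window).
brightest :: Stepper Maybe [Int] Int
brightest (_, [])   = Nothing
brightest (loc, ob) = Just (loc + snd (maximum (zip ob [0 ..])))

peakTracker :: Tracker Maybe Int
peakTracker = mkTracker window3 brightest
```

The stepper never sees the full frame, only what the observer hands it, which is the separation the two type synonyms are meant to enforce.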



3.2 A Primitive Tracker 

We can now assemble a primitive FVision tracker, the SSD tracker. Given a 
reference image, the observer pulls in a similar sized image from the video source 
at the current location. The stepper then compares the image from the current 
frame with the reference, returning a new location and a residual. This particular 
tracker uses a very simple location: a 2-D point and an orientation. The SSD 
observer is an XVision primitive: 
grabTransform2 :: Size -> Observer Image Transform2

where Size is a type defining the rectangular image size (in pixels) of the ref- 
erence image. The position and orientation of the designated area, as defined 
in the Transform2, are used to interpolate pixels from the video frame into an 
image of the correct size. 

The other component of SSD is the stepper: a function that compares a 
reference image with the observed image and determines the new location of the image.
The type of the stepper is 

ssdStep :: Image -> Stepper Residual Image Transform2

where the Image argument is the reference image. A detailed description of this
particular stepper is found in [11]. Now for the full SSD tracker: 

ssdTracker :: Image -> Tracker Residual Transform2
ssdTracker image =
  mkTracker (grabTransform2 (sizeOf image)) (ssdStep image)

Before we can use a tracker, we need a function that binds a tracker to a video 
source and initial location: 

runTracker :: Valued measure =>
  Video clk -> a -> Tracker measure a -> CEvent clk a
runTracker video a0 tracker = ma where
  locations = delay a0 aStream
  ma        = lift2 (,) locations video ==> tracker
  aStream   = ma ==> valueOf

The delay function delays the values of an event stream by one clock cycle,
returning an initial value, here a0, on the first clock tick.
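The feedback loop in runTracker can be modelled with ordinary lazy lists standing in for FRP event streams. This is a sketch under that assumption only: runTrackerL, Plain, and chase are hypothetical names, the result here is a list of measured values, and zipWith replaces lift2 (,) followed by ==>.

```haskell
class Valued c where
  valueOf :: c a -> a

-- A measure that carries nothing but the value (assumption for the demo).
newtype Plain a = Plain a deriving (Eq, Show)

instance Valued Plain where
  valueOf (Plain x) = x

type Image = Int
type Tracker measure a = (a, Image) -> measure a

-- delay on lists: emit the initial value first, then the stream.
delay :: a -> [a] -> [a]
delay a0 xs = a0 : xs

-- The three bindings are mutually recursive; laziness ties the knot,
-- because each new location depends only on earlier tracker outputs.
runTrackerL :: Valued measure
            => [Image] -> a -> Tracker measure a -> [measure a]
runTrackerL video a0 tracker = ma
  where
    locations = delay a0 aStream
    ma        = zipWith (curry tracker) locations video
    aStream   = map valueOf ma

-- Toy tracker: move one unit toward the target value in each frame.
chase :: Tracker Plain Int
chase (loc, target) = Plain (loc + signum (target - loc))
```

Without the delay the definitions would be circular with no starting point; the initial location is what makes the first tracker step well defined.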

We can also run a tracker to create a continuous behavior, Behavior b. 

runTrackerB :: Valued measure =>
  Video clk -> measure a -> Tracker measure a -> CEvent clk a
runTrackerB video ma0 trk =
  stepper ma0 (runTracker video (valueOf ma0) trk)

In this function, we need a measured initial state rather than an unmeasured 
one since the initial value of the behavior is measured. 

The clk in the type of runTracker is not of use in these small examples
but is essential to the integrity of multi-rate systems. For example, consider an 
animation driven by two separate video sources: 

scene :: Video clk1 -> Video clk2 -> PictureB

The type system ensures that the synchronous parts of the system, trackers 
clocked by either of the video sources, are used consistently: no synchronous 
operation may combine streams with different clock rates. By converting the 
clocked streams to behaviors, we can use both video sources to drive the resulting 
animation. 
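The clock check relies on the clock being a phantom type parameter. The following sketch shows the idea with a list-backed stand-in for CEvent; the representation, the Camera1 and Camera2 tags, and pairE are assumptions for illustration and are not the FRP library's definitions.

```haskell
-- The clk parameter is a phantom: it does not appear on the right-hand side.
newtype CEvent clk a = CEvent [a]

-- Hypothetical clock tags, one per video source.
data Camera1
data Camera2

-- A synchronous operation: both streams must share the same clock.
pairE :: CEvent clk a -> CEvent clk b -> CEvent clk (a, b)
pairE (CEvent xs) (CEvent ys) = CEvent (zip xs ys)

occurrences :: CEvent clk a -> [a]
occurrences (CEvent xs) = xs

left, right :: CEvent Camera1 Int
left  = CEvent [1, 2, 3]
right = CEvent [10, 20, 30]

other :: CEvent Camera2 Int
other = CEvent [7, 8]

sameClock :: CEvent Camera1 (Int, Int)
sameClock = pairE left right

-- mixed = pairE left other   -- rejected at compile time: clocks differ
```

The check costs nothing at run time, since the tag exists only in the types.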
3.3 More Complex Trackers

Consider an animator that switches between two different images: 
scene :: Transform2B -> BoolB -> PictureB
scene place which = transform2 place (ifB which picture1 picture2)

A tracker for this scene must recover both the location of the picture, place (a 
2-D transformation) and the boolean that selects the picture, which. Previously, 
we inverted the transformation for a fixed picture. Here we must also invert the 
ifB function to determine the state of the boolean. We also wish to retain the
same compositional program style used by the animator: our tracking function 
should have a structure similar to this scene function. 

The composite tracker must watch for both images, picture1 and picture2,
at all times. To determine which image is present, we examine the residual 
produced by SSD, a measure of the overall difference between the tracked image 
and the reference image. We formalize this notion of “best match” using the Ord 
class: 

instance Ord (Residual a) where
  r1 > r2 = residual r1 < residual r2

This states that smaller residuals are better than large ones. 

The bestOf function combines a pair of trackers into a tracker that follows 
whichever produces a better measure. The trackers share a common location: in 
the original scene description, there is only one transformation even though there 
are two pictures. The resulting values are augmented by a boolean indicating 
which of the two underlying trackers is best correlated with the present image. 
This value corresponds to the which of the animator. The projection of the
measured values onto the tracker output type is ignored: this combines the
internal tracker states instead of the observed values seen from outside.

bestOf :: (Functor measure, Ord measure) =>
  Tracker measure a -> Tracker measure a -> Tracker measure (a, Bool)
bestOf t1 t2 =
  \((loc, _), v) -> max (fmap (\x -> (x, True))  (t1 (loc, v)))
                        (fmap (\x -> (x, False)) (t2 (loc, v)))

The structure of bestOf is simple: the location (minus the additional boolean) 
is passed to both tracker functions. The results are combined using max. The 
fmap functions are used to tag the locations, exposing which of the two images 
is presently on target. 
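A compilable version of this combinator needs a few details the listing leaves out. The sketch below is one way to fill them in, under stated assumptions: the Ord instance is given through compare (with the Eq instance it requires), the constraint is written as Ord (measure (a, Bool)) so that it is well-kinded, and Image and matchBrightness are toy stand-ins invented for the example.

```haskell
data Residual a = Residual { resValue :: a, residual :: Float }
  deriving Show

instance Functor Residual where
  fmap f (Residual x r) = Residual (f x) r

-- Ordering looks only at the residual; a smaller residual ranks higher.
instance Eq (Residual a) where
  r1 == r2 = residual r1 == residual r2

instance Ord (Residual a) where
  compare r1 r2 = compare (residual r2) (residual r1)

type Image = Int   -- toy image: a single brightness value (assumption)
type Tracker measure a = (a, Image) -> measure a

bestOf :: (Functor measure, Ord (measure (a, Bool)))
       => Tracker measure a -> Tracker measure a -> Tracker measure (a, Bool)
bestOf t1 t2 = \((loc, _), v) ->
  max (fmap (\x -> (x, True))  (t1 (loc, v)))
      (fmap (\x -> (x, False)) (t2 (loc, v)))

-- Hypothetical tracker: the residual is the distance between the
-- observed brightness and a reference brightness.
matchBrightness :: Int -> Tracker Residual Int
matchBrightness ref (loc, px) =
  Residual loc (fromIntegral (abs (px - ref)))

darkOrBright :: Tracker Residual (Int, Bool)
darkOrBright = bestOf (matchBrightness 10) (matchBrightness 200)
```

Feeding darkOrBright a bright pixel selects the second tracker, so the boolean in the result is False; a dark pixel selects the first and yields True.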

This same code can be used on steppers as well as trackers; only the signature 
restricts bestOf to use trackers. This signature is also valid: 

bestOf :: (Functor measure, Ord measure) => 

Stepper measure observation a -> Stepper measure observation a -> 
Stepper measure observation (a, Bool) 

Thus steppers are composable in the same manner as trackers. This is quite 
useful: by composing steppers rather than trackers we perform one observation 
instead of two. Thus the user can define a more efficient tracker when combining
trackers with a common observation. 

Higher-order functions are a natural way to express this sort of abstraction 
in FVision. In C++ this sort of abstraction is more cumbersome: closures (used
to hold partially applied functions) must be defined and built manually. 

3.4 Adding Prediction 

We may improve tracking accuracy by incorporating better location prediction
into the system. When tracking a moving object we can use a linear approxima- 
tion of motion to more accurately predict object position in the next frame. A 
prediction function has this general form: 

type Predictor a = Behavior (Time -> a) 

That is, at a time t the predictor defines a function on times greater than t based 
on observations occurring before t. 

Adding a predictor to runTracker is simple:

runTrackerPred :: Valued measure =>
  Video clk -> Tracker measure a -> Predictor a -> CEvent clk a
runTrackerPred video tracker p =
  withTimeE video `snapshot` p
    ==> \((v, t), predictor) -> tracker (predictor t, v)

The FRP primitive withTimeE adds an explicit time to each frame of the video. 
Then snapshot, another FRP primitive, samples the predictor at the current 
time and adds sampled values of the prediction function to the stream. 

This is quite different from runTracker; there seems to be no connection
from the output of the tracker back to the input for the next step. The feedback
loop is now outside the tracker, expressed by the predictor. 

Using prediction, a tracking system looks like this: 

followImage :: Video clk -> Image -> Point2 -> CEvent clk Point2
followImage video i p0 =
  let ssd       = ssdTracker i
      p         = interp2 p0 positions
      positions = runTrackerPred video ssd p
  in positions

interp2 :: Point2 -> CEvent clk Point2 -> Predictor Point2

The interp2 function implements simple linear prediction. The first argu- 
ment is the initial prediction seen before the initial interpolation point arrives. 
This initial value allows the p0 passed to interp2 to serve as the initial observed
location. 
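The body of interp2 is not given in the paper. As a hedged sketch of the kind of linear prediction it describes, the function below extrapolates along the line through the two most recent timestamped observations; linearPredict and its argument layout are assumptions, not the FVision definition.

```haskell
type Time   = Double
type Point2 = (Double, Double)

-- Extrapolate along the line through the two most recent observations.
-- The first pair is the older observation, the second the newer one.
linearPredict :: (Time, Point2) -> (Time, Point2) -> Time -> Point2
linearPredict (t0, (x0, y0)) (t1, (x1, y1)) t
  | t1 == t0  = (x1, y1)                  -- no velocity estimate available
  | otherwise = (x1 + vx * dt, y1 + vy * dt)
  where
    vx = (x1 - x0) / (t1 - t0)            -- estimated velocity in x
    vy = (y1 - y0) / (t1 - t0)            -- estimated velocity in y
    dt = t - t1                           -- time since the last observation
```

Partially applying the first two arguments gives a function of type Time -> Point2, which is the shape a Predictor Point2 samples at each frame.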



4 Generalized Composite Trackers 

We have already demonstrated one way to compose trackers: bestOf. Here, we
explore a number of more general compositions. 
4.1 Trackers in Parallel

An object in an animation may contain many trackable features. These features 
do not move independently: their locations are related to each other in some 
way. Consider the following function for animating a square: 

scene :: Transform2B -> PictureB
scene t = transform2 t (polygon [(0,0), (0,1), (1,1), (1,0)])

In the resulting animation, trackers can discern four different line segments,
one for each edge of the square. The positions of these line segments are
somewhat correlated: opposite edges remain parallel after transformation. Thus we
have a level of redundancy in the trackable features. Our goal is to exploit this 
redundancy to make our tracking system more robust by utilizing relationships 
among tracked objects. 

A composite tracker combines trackers for individual object features into 
a tracker for the overall object. The relationship between the object and its 
features is represented using a pair of functions: a projection and an embedding. 
These functions map between the model state (parameters defining the overall 
object) and the states of the component trackers. The projection function maps a 
model state onto a set of component states and the embedding function combines 
the component states into a model state. This function pair is denoted by the 
following type: 

type EPair a b = (a -> b, b -> a) 

We now build a composite tracker that combines the states of two component 
trackers. In this example, we define a corner tracker using two component edge 
trackers. Edge trackers are implemented using the following XVision stepper: 

edgeStepper :: Stepper Sharpness Image LineSeg

The location maintained by the tracker is a line segment, denoted by the LineSeg 
type. This tracker observes an Image and measures the quality of tracking 
with the Sharpness type. This Sharpness type has the same structure as the 
Residual type but is mathematically distinct. To combine two line segments 
into a corner, we find the intersection of the underlying lines (possibly outside 
the line segments) and then “nudge” the line segment to this point. This is cru- 
cial since the edge trackers tend to creep away from the corner. The underlying 
geometric types are as follows: 

type LineSeg = (Point2, Vector2) 

type Corner = (Point2, Vector2, Vector2) 

We force the length of the vector defining a line segment to remain constant 
during tracking, allowing the use of a fixed size window on the underlying video 
stream. The projection and embedding functions are thus: 

cornerToSegs :: Corner -> (LineSeg, LineSeg)
cornerToSegs (corner, v1, v2) = ((corner, v1), (corner, v2))

segsToCorner :: (LineSeg, LineSeg) -> Corner
segsToCorner (seg1@(_, v1), seg2@(_, v2)) = (segintersect seg1 seg2, v1, v2)
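The segintersect function is not defined in the paper. The sketch below shows one way the intersection of the underlying lines could be computed, assuming points and vectors are pairs of Doubles. It is named lineIntersect to mark it as hypothetical, and unlike the function used above it returns Maybe so that parallel lines are reported rather than hidden.

```haskell
type Point2  = (Double, Double)
type Vector2 = (Double, Double)
type LineSeg = (Point2, Vector2)

-- Intersection of the infinite lines through two segments, or Nothing
-- when the lines are (nearly) parallel. Solves p + s*v = q + t*w for s
-- using the 2-D cross product: s = ((q - p) x w) / (v x w).
lineIntersect :: LineSeg -> LineSeg -> Maybe Point2
lineIntersect ((px, py), (vx, vy)) ((qx, qy), (wx, wy))
  | abs det < 1e-12 = Nothing
  | otherwise       = Just (px + s * vx, py + s * vy)
  where
    det = vx * wy - vy * wx
    s   = ((qx - px) * wy - (qy - py) * wx) / det
```

The returned point may lie outside both segments, which matches the text: the corner is where the lines meet, and the segments are then nudged to it.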
Next we need a function to combine two trackers using a projection/embedding
pair.

join2 :: (Joinable measure, Functor measure) =>
  Tracker measure a -> Tracker measure b -> EPair (a, b) c -> Tracker measure c
join2 t1 t2 (fromTup, toTup) =
  \(c, v) -> let (a, b) = toTup c
                 ma = t1 (a, v)
                 mb = t2 (b, v)
             in fmap fromTup (joinTup2 (ma, mb))

The structure of this function is the same as the bestOf function defined earlier. 
There is a significant addition though: the type class Joinable. Here we create a 
measured object from several measured sub-objects. Thus we must combine
the measurements of the sub-objects to produce an overall measurement.
The Joinable class captures this idea: 

class Joinable l where
  joinTup2 :: (l a, l b) -> l (a, b)
  joinTup3 :: (l a, l b, l c) -> l (a, b, c)   -- and so on

instance Joinable Sharpness where ...

The joinTup2 function joins two measured values into a single one, combining 
the measurements in some appropriate way. Joining measurements in a system- 
atic manner is difficult; we will avoid addressing this problem and omit instances 
of Joinable. 
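Although the paper omits the instances, a concrete example may help fix ideas. The sketch below shows one possible policy, in which a composite is only as sharp as its weakest component. The field layout of Sharpness is assumed from the remark that it has the same structure as Residual, and the choice of min is an illustration, not the authors' method.

```haskell
class Joinable l where
  joinTup2 :: (l a, l b) -> l (a, b)

-- Assumed shape: a value paired with a Float, like Residual.
data Sharpness a = Sharpness { sharpValue :: a, sharpness :: Float }
  deriving (Eq, Show)

-- One possible policy: take the minimum of the component sharpnesses.
instance Joinable Sharpness where
  joinTup2 (Sharpness a sa, Sharpness b sb) = Sharpness (a, b) (min sa sb)

example :: Sharpness (Char, Bool)
example = joinTup2 (Sharpness 'x' 0.9, Sharpness True 0.4)
```

Other policies, such as averaging or multiplying the component measures, fit the same class signature.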

Another way to implement joining is to allow the embedding function to see 
the underlying measurements and return a potentially different sort of measure- 
ment: 

join2m :: Tracker measure a -> Tracker measure b ->
          ((measure a, measure b) -> measure2 c, c -> (a, b)) ->
          Tracker measure2 c

This can be further generalized to allow all of the component trackers to use 
different measurements. However, in most cases we can hide the details of joining 
measured values within a type class and spare the user this extra complexity. 
Now for the corner tracker: 

trackCorner :: Tracker Sharpness LineSeg -> Tracker Sharpness LineSeg ->
               Tracker Sharpness Corner
trackCorner l1 l2 = join2 l1 l2 (segsToCorner, cornerToSegs)

The join2 function is part of a family of joining functions, each integrating some
specific number of underlying trackers. 

The corner tracker incorporates “crosstalk” between the states of two trackers 
but does not have to deal with redundant information. We now return to tracking 
a transformed square. Given four of these corner trackers, we now compose them 
into a square tracker. The underlying datatype for a square is similar to the 
corner: 



type Square = (Point2, Point2, Point2) 
We need specify only three points; the fourth is functionally dependent on
the other three. This type defines the image of a square under affine transforma- 
tion: from this image we can reconstruct the transformation (location, rotation, 
scaling, and shear). Our problem now is to map four tracked corners onto the 
three points defining the Square type. There are many possibilities: for exam- 
ple, we could throw out the point whose edges (the two vectors associated with 
the corner) point the least towards the other corners, probably indicating that 
the corner tracker is lost. Here, we present a strategy based on the Sharpness 
measure coming from the underlying trackers. 

The only significant difference between the previous example and this one is in 
the embedding function. We need to combine measured values in the embedding; 
thus the tracker is defined using join4m. First, we need to lift a function into the 
domain of measured values. Using the Joinable class we define the following: 

jLift3 :: Joinable m => (a -> b -> c -> d) -> (m a -> m b -> m c -> m d)
jLift3 f = \x y z -> let t = joinTup3 x y z in
                       fmap (\(x',y',z') -> f x' y' z') t

Using this, we build a function that generates a measured square from three 
measured points: 

mkSquare :: Sharpness Point2 -> Sharpness Point2 -> Sharpness Point2 ->
            Sharpness Square
mkSquare = jLift3 (\x y z -> (x,y,z))

Now we generate all possible squares defined by the corners, each using three 
of the four edge points, and choose the one with the best Sharpness measure 
using max: 

bestSquare :: (Sharpness Point2, Sharpness Point2, Sharpness Point2,
               Sharpness Point2) -> Sharpness Square
bestSquare (v1, v2, v3, v4) =
  mkSquare v1 v2 v3 `max` mkSquare v1 v2 v4 `max`
  mkSquare v1 v3 v4 `max` mkSquare v2 v3 v4

In summary, the family of join functions captures the basic structure of
parallel tracker composition. While this strategy occasionally requires somewhat
complex embedding functions, this happens exactly where the underlying domain is
also complex. Also, we can use overloading to express simple embedding strategies
in a concise and readable way.



4.2 Trackers in Series 

Another basic strategy for combining trackers is to combine slow but robust
“wide-field” trackers with fast but fragile “narrow-field” trackers to yield an
efficient, robust tracking network. The structure of this type of tracker does not
correspond to an animator since this deals with performance rather than expressiveness.
Switching between different trackers is governed by measures that determine 
whether the tracker is “on feature” or not. Consider the following three trackers: 







— A motion detector that locates areas of motion in the full frame. 

— A color blob tracker that follows regions of similarly colored pixels. 

— A SSD tracker targeted at a specific image. 

Our goal is to combine these trackers to follow a specific face with an unknown 
initial location. The motion detector finds an area of movement. In this area, the 
blob tracker finds a group of flesh-colored pixels. Finally, this blob is matched 
against the reference image. Each of these trackers suppresses the one immediately
preceding it: if the SSD tracker is “on feature” there is no need for the
other trackers to run.

The type signatures of these trackers are relatively simple:

motionDetect :: Tracker SizeAndPlace ()
blob         :: Color -> Tracker SizedAndOriented Point2
ssd          :: Image -> Tracker Residual Transform2

The motionDetect tracker is an example of a stateless tracker. That is, it does 
not carry information from frame to frame. Instead, it looks at the entire frame 
(actually a sparse covering of the entire frame) at each time step. Since there 
is no location to feed to the next step, all of the information coming out of 
motionDetect is in the measure. For the blob tracker we get both a size and an 
orientation, the axis that minimizes distance to the points. 

To compose trackers in series, we use a pair of state projection functions. 
This is similar to the embedding pairs used earlier except that there is an extra 
Maybe in the types: 

type SProjection m1 a1 m2 a2 = (m1 a1 -> Maybe a2, m2 a2 -> Maybe a1)

These functions lead up and down a ladder of trackers. At every step, if we are in
the lower state, we go “up” whenever the current tracker can produce an acceptable
state for the next higher tracker. If we are in the higher state, we drop down if the
current tracker is not in a suitable situation.

The tracker types reflect the union of the underlying tracker set. To handle 
measures, we need a higher order version of Either: 

data EitherT t1 t2 a = LeftT (t1 a) | RightT (t2 a)

instance (Valued t1, Valued t2) => Valued (EitherT t1 t2) where
  valueOf (LeftT x)  = valueOf x
  valueOf (RightT x) = valueOf x

Now we combine two trackers in series: 

tower :: Tracker m1 a1 -> Tracker m2 a2 -> SProjection m1 a1 m2 a2 ->
         Tracker (EitherT m1 m2) (Either a1 a2)
tower low high (up, down) =
  \(a, v) -> case a of
    Left a1 -> let ma1 = low (a1, v) in
               case up ma1 of
                 Nothing -> LeftT (fmap Left ma1)
                 Just a2 ->
                   let ma2 = high (a2, v) in
                   case down ma2 of
                     Nothing -> RightT (fmap Right ma2)
                     Just _  -> LeftT (fmap Left ma1)
    Right a2 -> let ma2 = high (a2, v) in
                case down ma2 of
                  Nothing -> RightT (fmap Right ma2)
                  Just a1 -> LeftT (fmap Left (low (a1, v)))

This calls each of the sub-trackers no more than once per time step. The 
invariants here are that we always attempt to climb higher if in the lower state 
and that we never return a value in the higher state if the down function rejects 
it. 

Before using the tower function, we must construct the state projections. 
Without showing actual code, they function as follows: 

— Move from motionDetect to blob whenever the size of the area in motion 
is greater than some threshold (normally set fairly small). Use the center of
the area in motion as the initial state in the blob tracker.

— Always try to move from blob to SSD. Use the blob size and orientation to 
create the initial transformation for the SSD tracker state. 

— Drop from ssd to blob when the residual is greater than some threshold. 
Use the position in the transformation as the initial state for blob.

— Drop from blob to motionDetect when the group of flesh-toned pixels is 
too small. 

The composite tracker has the following structure: 

faceTrack :: Image ->
             Tracker (EitherT (EitherT SizeAndPlace SizedAndOriented) Residual)
                     (Either (Either () Point2) Transform2)
faceTrack image =
  tower (tower motionDetect blob (upFromMD, downFromBlob))
        ssd onlyRight (upFromBlob, downFromSSD)
  where
    upFromMD mt =
      if mArea mt > mdThreshold then Just (mCenter mt) else Nothing
    downFromBlob mt =
      if blobSize mt < bThreshold then Just (blobCenter mt) else Nothing
    upFromBlob mt =
      Just (translate2 (blobCenter mt)
            `compose2`
            rotate2 (blobOrientation mt))
    downFromSSD mt =
      if residual mt > ssdthreshold
        then Just (origin2 `transform2` (valueOf mt))
        else Nothing

The onlyRight function is needed because the composition of motion detection 
and blob tracking yields an Either type instead of a blob type. The onlyRight 
function (not shown) keeps the ssd tracker from pulling on the underlying tracker
when it is looking for motion rather than at a blob. 

The output of this tracker would normally be filtered to remove states from 
the “backup” trackers. That is, the ultimate result of this tracker would probably 
be Behavior (Maybe Point2) rather than Behavior Point2. Thus when the
composite tracker is hunting for a face rather than on the face this will be 
reflected in the output of the tracker. 



5 Performance 

Programs written in FVision tend to run at least 90% as fast as the native
C++ code, even though they are being run by a Haskell interpreter. This can be
attributed to the fact that the bottleneck in vision processing programs is not in
the high-level algorithms, as implemented in Haskell, but in the low-level image
processing algorithms written in C++. As a result, we have found that FVision
is a realistic alternative to C++ for prototyping or even delivering applications.
While there are, no doubt, situations in which the performance of Haskell code
may require migration to C++ for efficiency, it is often the case that the use of a
declarative language to express the high-level organization of a vision system has no
appreciable impact on performance. Furthermore, the Haskell interpreter used
in our experiment, Hugs, has a very small footprint and can be included in an
application without seriously increasing the overall size of the vision library.



6 Related Work 

We are not aware of any other efforts to create a declarative language for com- 
puter vision, although there does exist a DSL for writing video device drivers 
[12] which is at a lower level than this work.

There is much work on tools for building domain-specific languages such as
FVision from scratch, but most relevant are previous efforts of our own on embedded
DSLs [5, 4] that use an existing declarative language as the basic framework.
General discussions of the advantages of programming with pure functions are
also quite numerous; two of particular relevance to our work are one using
functional languages for rapid prototyping [3] and one that describes the power of
higher-order functions and lazy evaluation as the “glue” needed for modular
programming [6].



7 Conclusions 

FVision has proven to be a powerful software engineering tool that increases 
productivity and flexibility in the design of systems using visual tracking. As 
compared to XVision, the original C++ library, FVision reveals the essential
structure of tracking algorithms much more clearly. 
Some of the lessons learned in this project include:

1. Visual tracking offers fertile ground for the deployment of declarative pro- 
gramming technology. The underlying problems are so difficult that the pay- 
off in this domain is very high. FVision is significantly better for prototyping 
tracking-based applications than the original XVision system. 

2. The process of creating FVision uncovered interesting insights that were not
previously apparent even to the original XVision developers. Working from the
“bottom up” to develop a new language forces the domain specialists to 
examine (or re-examine) the underlying domain for the right abstractions 
and interfaces. 

3. The principal features of Haskell, a rich polymorphic type system and higher- 
order functions, were a significant advantage in FVision. 

4. FRP provides a rich framework for inter-operation among the various system 
components. By casting trackers in terms of behaviors and events we were 
able to integrate them smoothly into other systems. 

This work was supported by NSF grant CCR-9706747 in experimental soft- 
ware systems. 
Abstract. In the context of direct and reflection based extension mech-
anisms for the Jinni 2000 Java based Prolog system, we discuss the design 
and the implementation of a reflection based Prolog to Java interface. 
While the presence of dynamic type information on both the Prolog and 
the Java sides allows us to automate data conversion between method 
parameters, the presence of subtyping and method overloading makes 
finding the most specific method corresponding to a Prolog call pattern 
fairly difficult. We describe a run-time algorithm which closely mimics 
Java’s own compile-time method dispatching mechanism and provides 
accurate handling of overloaded methods beyond the reflection package’s 
limitations. As an application of our interfacing technique, a complete 
GUI library is built in Prolog using only 10 lines of application specific 
Java code. 



Keywords: Java based Language Implementation, Prolog to Java Interface, 
Method Signatures, Dynamic Types, Method Overloading, Most Specific Method, 
Reflection 

1 Introduction 

In this paper, we discuss the extension of the Jinni 2000 Java based Prolog 
system [6, 8] with a reflection based generic Java Interface. After overviewing 
the Jinni architecture we describe the Prolog API (Application Programmer In- 
terface) used to invoke Java code and the mapping mechanism between Prolog 
terms and their Java representations. We next discuss the implementation of a 
reflection based Prolog-to-Java interface. We will overcome some key limitations 
of Java’s Reflection package (a Java API which accesses Java objects and classes 
dynamically). The main problem comes from the fact that reflection does its 
work at run time. Although the called classes have already been compiled, the invoking
code needs to determine a (possibly overloaded) method’s signature dynamically,
something that Java itself does through fairly extensive compile-time analysis.
First, we discuss the desired functionality provided by Java at compile time and
explain it through a simple example. Subsequently, we provide the algorithm be- 
hind our implementation, which achieves at runtime the functionality that Java 
provides at compile time. We also show some nice properties of our algorithm 
such as low computational complexity. Finally we describe an example applica- 
tion (a GUI for Jinni) developed almost completely in Prolog using the reflection 
API. 

The Method Signature Problem. Most modern languages support method overloading
(the practice of having more than one method with the same name). In
Java this also interacts with the possibility of having some methods located in
superclasses on the inheritance chain. On a call to an overloaded method, the
resolution of which method is to be invoked is based on the method signature.
A method signature is defined as the name of the method, its parameter types and
its return type¹.

The problem initially seems simple: just look for the method whose name matches
the call and whose parameters match the arguments of the call in number and type,
and pick that method.

The actual problem arises because Java allows method invocation conversion.
In other words, we are not looking for an exact match between the type of a
parameter and the corresponding argument; we say it is a match if the type of
the argument can be converted to the type of the corresponding parameter by
method invocation conversion [2]. At first sight, this does not seem very
complicated either: we just check whether the argument type converts to
the corresponding parameter type or not. The difficulty is that we may
find several such matches, and we have to search among these matches for the most
specific method, as Java does through compile-time analysis. If such a method
exists, then that is the one we invoke. However, should this search fail, an error
has to be reported stating that no single method can be classified as the most
specific method.

This paper will propose a comprehensive solution to this problem, in the 
context of the automation of type conversions in Jinni 2000’s bidirectional Prolog 
to Java interface. 



2 The Jinni Architecture 

Jinni 2000 combines fast WAM-based Prolog engines, which use integer arrays for
WAM data areas, with a Java class-based term hierarchy on which a lightweight
Java-based interpreter runs. The interpreter interoperates with the internal
tagged-integer WAM representation through an automated bidirectional data
conversion mechanism.

Jinni’s Java interface uses the more flexible and programmer-friendly Java
class hierarchy of its interpreted Kernel Prolog [8] engines, instead of the
high-performance but fairly complex integer-based term representations of its
WAM-based BinProlog-style engines [9].

¹ In resolving the method call Java ignores the return type.

In this section we will explain the mapping from Prolog term types to Java 
classes. 



Fig. 1. Java Classes of Prolog Term Hierarchy



2.1 The Term Hierarchy 

The base class is Term which has two subclasses: Var and NonVar. The NonVar
class is in turn extended by Num, JavaObject and Const. Num is extended
by Integer and Real. Term represents the generic Prolog term, which is
a finite tree with the unification operation distributed across data types in a
truly object-oriented style [6]. The Var class represents a Prolog variable.
Integer and Real are the Prolog numbers. Const represents all symbolic Prolog
constants, with the compound term (called functor in Prolog) constructor class
Fun designed as an extension of Const.

JavaObject is also a subclass of Const which unifies only with itself² and
is used like a wrapper around Objects in Java to represent Prolog predicates.



2.2 The Builtin Registration Mechanism 

Jinni’s Builtins class is a specialized subclass of Java’s Hashtable class. Every
new component we add to Jinni 2000 can provide its own builtin predicates as a
subclass of the Builtins class. Each added component will have many predicates,
which are stored in this Hashtable, mapping their Prolog representation
to their Java code for fast lookup. Let us assume the Prolog predicate’s name
is prologName and the corresponding Java class’s name is javaName. We make
a class called javaName which extends FunBuiltin, a descendant of Term with
which we represent a Prolog functor (compound term). Its constructor accepts a
string (the functor’s name) and an integer (the arity). When we call the register
method of the appropriate descendant of the Builtins class, a new Hashtable
entry is generated with the supplied prologName and the arity as key and
javaName as its value. Whenever the Prolog predicate prologName with the
appropriate arity is called, we can look up in constant time which class (javaName)
actually implements the exec method of the builtin predicate in Java. Each
component extending the Builtins class will bring new such predicates, and they
will be added to the inherited Hashtable with the mechanism described above;
they are therefore usable as Prolog builtins as if they were part of the Jinni
2000 kernel.

² Modulo Java’s equals relation.



2.3 The Builtin Execution Mechanism 

The descendants of the FunBuiltin class implement builtins which pass
parameters, while the descendants of the ConstBuiltin class implement
parameterless builtins. Both FunBuiltin and ConstBuiltin have an abstract method
called exec to be implemented by the descendant javaName class. This is
the method that is actually mapped to the Prolog builtin predicate with
prologName and gets invoked on execution of the predicate. The exec method
implemented by the javaName class will get arguments (Term and its subclasses)
from the predicate instance using getArg methods and will discover their
dynamic types through a specialized method. Once we have the arguments and know
their types we can do the required processing. The putArg method, used to
return or check values, uses the unify method of Term and its subclasses to
communicate with the actual (possibly variable) predicate arguments. On
success this method returns 1. If putArg does not fail for any argument, the exec
method returns 1, which is interpreted as a success by Prolog. If at least one
unification fails we return 0, which is interpreted as a failure by Prolog. We call
this mapping a conventional builtin, as it looks like a builtin from the Prolog side,
which is known at compile time and can be seen as part of the Prolog kernel.
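
The registration and execution conventions described in Sections 2.2 and 2.3 can be illustrated with a minimal sketch. It is not Jinni's actual code: the class names SketchBuiltin, SketchBuiltins and SketchBuiltinsDemo, the "name/arity" key format, and the use of plain Object arguments instead of Term are all assumptions made for illustration. Only the Hashtable-based lookup and the 1/0 return convention come from the text.

```java
import java.util.Hashtable;

/** Hypothetical stand-in for a builtin: exec returns 1 on success, 0 on failure. */
abstract class SketchBuiltin {
    final String name;
    final int arity;

    SketchBuiltin(String name, int arity) {
        this.name = name;
        this.arity = arity;
    }

    abstract int exec(Object[] args);
}

/** Hypothetical registry: a Hashtable subclass keyed by "name/arity". */
class SketchBuiltins extends Hashtable<String, SketchBuiltin> {
    void register(SketchBuiltin b) {
        put(b.name + "/" + b.arity, b);
    }

    /** Looks up the builtin by name and arity; reports failure (0) if none is registered. */
    int call(String name, Object... args) {
        SketchBuiltin b = get(name + "/" + args.length);
        return b == null ? 0 : b.exec(args);
    }
}

class SketchBuiltinsDemo {
    /** Builds a table with one builtin, same/2, which succeeds when its arguments are equal. */
    static SketchBuiltins demo() {
        SketchBuiltins table = new SketchBuiltins();
        table.register(new SketchBuiltin("same", 2) {
            @Override
            int exec(Object[] args) {
                return args[0].equals(args[1]) ? 1 : 0;
            }
        });
        return table;
    }
}
```

A component would add its own predicates by calling register on the shared table; the lookup cost stays constant because the key is a single hashed string.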



3 The Reflection Based Jinni 2000 Java Interface API 

Our reflection based Jinni 2000 Java Interface API is provided through a sur- 
prisingly small number of conventional Jinni builtins. This property is shared 
with the JIPL [3] interface from C-based Prologs to Java. The similarity comes 
ultimately from the fact that Java’s reflection package exhibits to Java the same 
view provided to C functions by JNI - the Java Native Interface: 



new_java_class(+'ClassName', -Class).

This takes in as its first argument the name of the class as a constant. If a class
with that name is found it loads the class and a handle to this class is returned
in the second argument wrapped inside our JavaObject. Now this handle can be
used to instantiate objects.

new_java_obj(+Class, -Obj) :- new_java_obj(Class, new, Obj).
new_java_obj(+Class, +new(parameters), -Obj).

This takes in as the first argument a Java class wrapped inside our JavaObject.
In the case of a constructor with parameters, the second argument consists
of new and parameters (Prolog numeric or string constants or other objects
wrapped as JavaObjects) for the constructor. As with ordinary methods,
the most specific constructor corresponding to the argument types is
searched and invoked. This returns a handle to the new object thus created,
again wrapped in JavaObject, in the last argument of the predicate. If the second
parameter is missing then the void constructor is invoked. The handle returned
can be used to invoke methods:

invoke_java_method(+Object, +methodname(parameters), -ReturnValue).

This takes in as the first argument a Java class’s instantiated object (wrapped
in JavaObject), and the method name with parameters (these can again be
numerical or string constants or objects wrapped as JavaObjects) in the second
argument. If we find such an (accessible and unambiguously most specific) method
for the given object, then that method is invoked and the return value is put
in the last argument. If the return value is a number or a string constant it
is returned as a Prolog number or constant; otherwise it is returned wrapped as a
JavaObject.

If we wish to invoke static methods the first argument needs to be a class
wrapped in JavaObject; otherwise the calling mechanism is the same.

The mapping of datatypes between Prolog and Java looks like this:

Java                        Prolog
--------------------------  ------------------------------------------
int (maybe short, long)     Integer
double (maybe float)        Real
java.lang.String            Const
any other Object            JavaObject (a bound variable, which unifies
                            only with itself)

Table 1. Data Conversion
4 The Details of Implementation

4.1 Creating a Class 

The reflection package uses the Java reflection API to load Java classes at
runtime, instantiate their objects and invoke methods of both classes and objects.
The Java Class.forName("classname") method is used to create a
class at runtime. In case an exception occurs, an error message stating the
exception is printed out and a 0 is returned, which is interpreted as a failure by
Prolog. The error message printing can be switched on/off by using a flag.

This is interfaced with Prolog using the conventional Builtin extension mech- 
anism getting the first argument passed as a Prolog constant seen by Java as a 
String. After this, the Java side processing is done and the handle to the required 
class is obtained. Finally this handle wrapped as a JavaObject is returned in the 
second argument. 

Example: 

new_java_class('java.lang.String',S)

Output: 

S=JavaObject(java.lang.Class-623467) 
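
The Java side of this step can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: the class name ClassLoadingSketch and the choice of returning null on failure (where the interface itself returns 0 to Prolog and optionally prints a message) are hypothetical.

```java
final class ClassLoadingSketch {
    /**
     * Returns the Class object for a fully qualified name,
     * or null if the class cannot be found or linked.
     */
    static Class<?> loadClass(String name) {
        try {
            return Class.forName(name);
        } catch (ClassNotFoundException | LinkageError e) {
            // The interface described above would report failure to Prolog here.
            return null;
        }
    }
}
```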

4.2 Instantiating an Object

First of all, the arguments of a constructor are converted into a list, then parsed
in Prolog and provided to Java as JavaObjects. Then each one is extracted
individually. If the parameter list is empty then a special token is passed instead of
the JavaObject, which tells the program that a void constructor is to be used
to instantiate a new object from the class. This is done by invoking the given
class’ newInstance() method, which returns the required object.

If the argument list is not empty, the class (dynamic type) of the objects
on the argument list is determined using the getClass() method and stored in
an array. This array is used to search for the required constructor of the given
class using the getConstructor(parameterTypes) method. Once the constructor
is obtained, its newInstance(parameterList) method is invoked to obtain the
required object. The exception mechanism is exactly the same as for creating a
new class as explained above.

This also uses the conventional Builtin extension mechanism to interface
with Java, therefore Objects are wrapped as JavaObjects. Prolog Integers are
mapped to Java int and Prolog’s Real type becomes Java double. The reverse
mapping from Java is slightly different, as long, int and short are mapped to
Prolog’s Int, which holds its data in a long field, and the float and double types
are mapped to Prolog’s Real (which holds its data in a double field). Java Strings
are mapped to Prolog constants and vice versa (this is symmetric).

Example:

new_java_obj(S,new(hello),Mystring)

Output:

Mystring=JavaObject(java.lang.String-924598)
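
A minimal sketch of the instantiation step follows, assuming the arguments have already been converted to Java objects. The class name InstantiationSketch and the null-on-failure convention are hypothetical. The no-argument case uses getConstructor().newInstance() rather than Class.newInstance(), which the text mentions but which is deprecated in recent Java versions.

```java
import java.lang.reflect.Constructor;

final class InstantiationSketch {
    /**
     * Instantiates cls using the constructor whose parameter types are exactly
     * the dynamic classes of args. Returns null if no such constructor exists
     * or if construction fails.
     */
    static Object instantiate(Class<?> cls, Object... args) {
        try {
            if (args.length == 0) {
                // Void constructor.
                return cls.getConstructor().newInstance();
            }
            Class<?>[] types = new Class<?>[args.length];
            for (int i = 0; i < args.length; i++) {
                types[i] = args[i].getClass();
            }
            Constructor<?> c = cls.getConstructor(types);
            return c.newInstance(args);
        } catch (ReflectiveOperationException e) {
            return null;
        }
    }
}
```

Because getConstructor looks for exact parameter types, a call such as instantiate(java.util.ArrayList.class, someArrayList) finds nothing: the matching constructor is declared with a Collection parameter, not ArrayList.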

4.3 Invoking a Method 

The method invocation mechanism is very similar to the object instantiation
mechanism. The mapping of datatypes remains the same. The exception mechanism
is also exactly the same as that for constructing objects and classes.

First we determine the class of the given object. In place of getConstructor
we use getMethod(methodName, parameterTypes), which additionally takes
the method name as its first argument. Once the method is determined, its
return type is determined using getReturnType().getName() for the mapping
of Prolog and Java datatypes following the convention described earlier. If the
return type is void the value returned to Prolog will be the constant ’void’. To
invoke the required method we call the obtained method’s
invoke(Object, parameterList) method and return, after conversion,
the return value of the given method.

To invoke static methods, first we determine whether the object passed as the
first argument is an instance of the class Class. If so, this is taken to be the class
whose method is to be searched, and the call to invoke looks like
invoke(null, parameterList).

Example:

invoke_java_method(Mystring,length,R)

Output:

R=5

Example:

invoke_java_method(Mystring,toString,NewR)

Output:

NewR=hello
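
The invocation step, including the static case and the ’void’ convention, might look like the following sketch. The class name InvocationSketch and the null-on-failure convention are assumptions; the conversion of results back to Prolog terms is left out.

```java
import java.lang.reflect.Method;

final class InvocationSketch {
    /**
     * Invokes the named method on target using an exact-match lookup.
     * If target is itself a Class, the method is searched in that class and
     * invoked with a null receiver, as for static methods.
     * Returns "void" for void methods and null if lookup or invocation fails.
     */
    static Object call(Object target, String name, Object... args) {
        try {
            boolean isStatic = target instanceof Class;
            Class<?> cls = isStatic ? (Class<?>) target : target.getClass();
            Class<?>[] types = new Class<?>[args.length];
            for (int i = 0; i < args.length; i++) {
                types[i] = args[i].getClass();
            }
            Method m = cls.getMethod(name, types);
            Object result = m.invoke(isStatic ? null : target, args);
            return m.getReturnType() == void.class ? "void" : result;
        } catch (ReflectiveOperationException e) {
            return null;
        }
    }
}
```

With this sketch, call("hello", "length") yields the Integer 5 and call(Integer.class, "parseInt", "42") yields 42, mirroring the Prolog examples above.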

5 Limitations of Reflection 

An important limitation of the reflection mechanism is that when we are searching
for a method or a constructor of a given class using the given parameter
types, the reflection package looks only for exact matches. That means that if we
have an object of class Sub and we pass it to a method which accepts as argument
an object of class Super, which is Sub’s superclass, we are able to invoke such
a method in normal Java, but in the case of reflection our search for such a method
would fail and we would be unable to invoke it. The same situation
occurs for a method accepting an interface, which actually means
accepting all objects implementing the interface. The problem arises in the first
place because a method could be overloaded, and Java decides which method to
call amongst overloaded methods at compile time and not at runtime. We
discuss in the next section how the Java compiler decides which method to call at
compile time.
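
The exact-match behaviour is easy to reproduce. In the sketch below the names ExactMatchLimitation, found and direct are hypothetical; Super, Sub and A follow the naming used in this paper. The lookup uses the standard Class.getMethod call.

```java
final class ExactMatchLimitation {
    public static class Super {}
    public static class Sub extends Super {}

    public static class A {
        public String m(Super s) { return "super"; }
    }

    /** True if getMethod finds a method m declared with exactly this parameter type. */
    static boolean found(Class<?> paramType) {
        try {
            A.class.getMethod("m", paramType);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    /** The same call written directly compiles, thanks to method invocation conversion. */
    static String direct() {
        return new A().m(new Sub());
    }
}
```

Here found(Super.class) succeeds while found(Sub.class) fails, although direct() is legal Java and returns "super".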
6 Java Compile Time Solution

The following steps determine which method to call once we supply
the object whose method is to be invoked and the argument types.

6.1 Finding the Applicable Methods 

The methods that are applicable have the following two properties: 

— The name of the method is the same as in the call and the number of parameters
is the same as the number of arguments in the method call.

— The type of each argument can be converted to the type of the corresponding
parameter by method invocation conversion.

This broadly means that either the parameter’s class is the same as the
corresponding argument’s class, or that it is on the inheritance chain built from
the argument’s class up to Object. If the parameter is an interface, the argument
must implement that interface. We refer to [2] for a detailed description of this
mechanism.
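
For reference types, both conditions can be checked with Class.isAssignableFrom, which covers the superclass chain and implemented interfaces. The sketch below is a simplification: it ignores primitive conversions, and the names ApplicableMethodsSketch, applicable and Demo are hypothetical.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

final class ApplicableMethodsSketch {
    /** Small hypothetical class with three overloads of m. */
    public static class Demo {
        public void m(Object o) {}
        public void m(CharSequence s) {}
        public void m(Integer i) {}
    }

    /** Collects the public methods of cls that are applicable to the given call. */
    static List<Method> applicable(Class<?> cls, String name, Class<?>... argTypes) {
        List<Method> result = new ArrayList<>();
        for (Method m : cls.getMethods()) {
            Class<?>[] params = m.getParameterTypes();
            // Property 1: same name and same number of parameters.
            if (!m.getName().equals(name) || params.length != argTypes.length) {
                continue;
            }
            // Property 2: every argument type converts to its parameter type.
            boolean ok = true;
            for (int i = 0; i < params.length; i++) {
                if (!params[i].isAssignableFrom(argTypes[i])) {
                    ok = false;
                    break;
                }
            }
            if (ok) {
                result.add(m);
            }
        }
        return result;
    }
}
```

For a String argument, m(Object) and m(CharSequence) are applicable but m(Integer) is not.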

6.2 Finding the Most Specific Method 

Informally, method1 is more specific than method2 if any invocation handled
by method1 can also be handled by method2.

More precisely, if the parameter types of method1 are $M_{11}$ to $M_{1n}$ and the
parameter types of method2 are $M_{21}$ to $M_{2n}$, method1 is more specific than
method2 if $M_{1j}$ can be converted to $M_{2j}$ for all $j$ from 1 to $n$ by method
invocation conversion.
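
The definition translates almost literally into code. The sketch below assumes reference types only and uses Class.isAssignableFrom as a stand-in for method invocation conversion; the names MoreSpecificSketch and moreSpecific are hypothetical.

```java
final class MoreSpecificSketch {
    /**
     * True if a method with parameter types p1 is more specific than one with
     * parameter types p2, i.e. p1[j] converts to p2[j] for every position j.
     */
    static boolean moreSpecific(Class<?>[] p1, Class<?>[] p2) {
        if (p1.length != p2.length) {
            return false;
        }
        for (int j = 0; j < p1.length; j++) {
            if (!p2[j].isAssignableFrom(p1[j])) {
                return false;
            }
        }
        return true;
    }
}
```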

6.3 Overloading Ambiguity 

In case no method is found to be most specific, the method invocation is
ambiguous and a compile-time error occurs.

Example:

Consider a class A that is a superclass of B, and two methods named m:

m(A,B)

m(B,A)

An invocation which causes the ambiguity is:

m(instance of B, instance of B)

In this case both methods are applicable but neither is the most specific, as
m(instance of A, instance of B) can be handled only by the first one while
m(instance of B, instance of A) can be handled only by the second one; i.e., for
neither method can all parameters be converted to the other’s by method
invocation conversion.
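
The ambiguity check can be sketched as a search for a candidate that is more specific than every other candidate. As before, this assumes reference types only; the names AmbiguitySketch, moreSpecific and mostSpecific, and the convention of returning -1 for an ambiguous call, are hypothetical.

```java
import java.util.List;

final class AmbiguitySketch {
    public static class A {}
    public static class B extends A {}

    /** True if p1[j] converts to p2[j] for every parameter position j. */
    static boolean moreSpecific(Class<?>[] p1, Class<?>[] p2) {
        for (int j = 0; j < p1.length; j++) {
            if (!p2[j].isAssignableFrom(p1[j])) {
                return false;
            }
        }
        return true;
    }

    /**
     * Returns the index of the candidate signature that is more specific than
     * all others, or -1 if there is none (the invocation is ambiguous).
     * All candidates are assumed to have the same number of parameters.
     */
    static int mostSpecific(List<Class<?>[]> candidates) {
        for (int i = 0; i < candidates.size(); i++) {
            boolean best = true;
            for (int k = 0; k < candidates.size(); k++) {
                if (!moreSpecific(candidates.get(i), candidates.get(k))) {
                    best = false;
                    break;
                }
            }
            if (best) {
                return i;
            }
        }
        return -1;
    }
}
```

For the candidates m(A,B) and m(B,A) the result is -1. Adding a third overload m(B,B) would resolve the call, since it is more specific than both.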
6.4 Example: Method Resolution at Compile Time

Method resolution takes place at compile time in Java and is dependent on the
code which calls the method. This becomes clear from the following example.

Consider two classes Super and Sub where Super is superclass of Sub. Also 
consider class A with a method m and class Test with a method test, the code 
for the classes looks like this: 

Super.java

public class Super {}

Sub.java

public class Sub extends Super {}

A.java

public class A {
    public void m(Super s) { System.out.println("super"); }
}

Test.java

public class Test {
    public static void test() {
        A a = new A();
        a.m(new Sub());
    }
}

On invocation of method test() of class Test, method m(Super) of class A is 
invoked and super is printed out. Let’s assume that we change the definition of 
the class A and overload the method m(Super) with method m(Sub) such that 
A looks like this: 

A.java

public class A {
    public void m(Super s) { System.out.println("super"); }
    public void m(Sub s) { System.out.println("sub"); }
}

If we recompile, and run our test method again, we expect sub to be printed out 
since m(Sub) is more specific than m(Super) but actually super is printed out. 
The fact is method resolution is done when we are compiling the file containing 
the method call and when we compiled the class Test we had the older version 
of class A and Java had done resolution based on that class A. We can get the 
expected output by recompiling class Test, which now views the newer version of 
class A and does the resolution according to that, and hence we get the expected 
output sub. 
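
A related, single-file way to see that overload resolution is fixed at compile time is to vary the static type of the argument while keeping its runtime class the same. This is a different experiment from the recompilation scenario above, offered only as an illustration; the names StaticOverloadDemo, viaSuperRef and viaSubRef are hypothetical.

```java
final class StaticOverloadDemo {
    static class Super {}
    static class Sub extends Super {}

    static String m(Super s) { return "super"; }
    static String m(Sub s) { return "sub"; }

    /** The argument is a Sub at runtime, but its static type is Super. */
    static String viaSuperRef() {
        Super s = new Sub();
        return m(s);
    }

    /** Here the static type of the argument is Sub. */
    static String viaSubRef() {
        return m(new Sub());
    }
}
```

The first call selects m(Super) and the second selects m(Sub), even though both pass a Sub instance: the compiler chooses the overload from the static types it sees at the call site.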
7 Finding a Most Specific Method at Runtime

We will follow a simple algorithm. Let’s assume that the number of methods
which are accessible and have the same name and number of parameters as the call
is M (a small constant) and the number of arguments in the call is A (a small
constant). Let us assume that the maximum inheritance depth of the class of an
argument from Object down to itself in the class hierarchy tree is D (a small
constant). It can be trivially shown that the complexity of our algorithm is bounded
by O(M * A * D). Our algorithm mimics exactly the functionality of Java,
and the following example would run exactly the same on both Java and our
interface. The only difference is that since Java does the resolution at compile
time, in case of an ambiguous call Java would report a compile-time error, while
we do the same thing at runtime and hence throw an exception with an appropriate
error message. So if class A looks like this:

A.java

public class A {
    public void m(Super s1, Sub s2) {System.out.println("super");}
    public void m(Sub s1, Super s2) {System.out.println("sub");}
}

and the class Test looks like this:

Test.java

public class Test {
    public static void test(){
        A a=new A();
        a.m(new Sub(),new Sub());
    }
}

then Java will not compile class Test and will give an error message. In our case
there is no such thing as the class Test, but the equivalent of the above code
would look as follows:

new_java_class('A',Aclass),
new_java_obj(Aclass,Aobject),
new_java_class('Sub',Subclass),
new_java_obj(Subclass,Subobject1),
new_java_obj(Subclass,Subobject2),
invoke_java_method(Aobject,m(Subobject1,Subobject2),Return).

The result will be an ambiguous exception message and the goal failing with no. 

Our Algorithm We will now describe our algorithm in detail: 

CorrectMethodCaller(methodName, Arguments[ ] /*Size A*/)

 1. Method = get_exact_match_reflection(methodName, Arguments[ ])
 2. If Method != null
 3.     return invoke(Method, Arguments[ ])
 4. MethodArray[ ] = get_methods(methodName, A)   /*Size M*/
 5. MethodParameterDistanceArray[ ][ ] = {infinity}   /*Size M*A*/
 6. For m = 0 to M-1
 7.     MethodParameterArray[m][ ] =
            MethodArray[m].get_method_parameters()   /*Size M*A*/

    /*Finds distances of method parameters from the arguments
      and stores in the array*/
 8. For a = 0 to A-1 do
 9.     DistanceCounter = 0
10.     While Arguments[a].type != null do   /*Loops over D*/
11.         For methods m = 0 to M-1 do
12.             If MethodParameterArray[m][a] == Arguments[a].type
13.                 MethodParameterDistanceArray[m][a] = DistanceCounter
14.         Arguments[a].type = Super(Arguments[a].type)
15.         DistanceCounter = DistanceCounter + 1

    /*Find the method with minimum distance of parameters from arguments*/
16. Minimum = infinity
17. For m = 0 to M-1 do
18.     Sum = 0
19.     For a = 0 to A-1 do
20.         Sum = Sum + MethodParameterDistanceArray[m][a]
21.     If Sum < Minimum
22.         mChosen = m; Minimum = Sum

    /*Check if our selection is correct*/
23. For m = 0 to M-1 do
24.     If m == mChosen
25.         continue
        /*Skip those methods in which at least one parameter never occurs
          in the inheritance hierarchy from the argument to Object*/
26.     For a = 0 to A-1 do
27.         If MethodParameterDistanceArray[m][a] == infinity break
28.     If a < A continue
        /*Check if "most specific method condition" is violated by mChosen*/
29.     For a = 0 to A-1 do
30.         If MethodParameterDistanceArray[m][a] <
                MethodParameterDistanceArray[mChosen][a]
31.             Throw ambiguous exception
32. return invoke(MethodArray[mChosen], Arguments[ ])
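
The distance-based selection above can be sketched in Java on top of
java.lang.reflect. This is only an illustration under stated assumptions, not
the authors' implementation: the names MostSpecificDemo, call, distance and
Ambiguous are hypothetical, the exact-match shortcut of steps 1-3 is omitted,
inapplicable methods are dropped up front instead of being skipped in the final
check, and only superclass chains of non-null object arguments are walked (no
interfaces or primitive types).

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

final class MostSpecificDemo {
    // Hypothetical classes mirroring the Super/Sub/A examples in the text.
    public static class Super {}
    public static class Sub extends Super {}
    public static class A {
        public String m(Super s) { return "super"; }
        public String m(Sub s) { return "sub"; }
    }
    public static class Ambiguous {
        public String m(Super s1, Sub s2) { return "super"; }
        public String m(Sub s1, Super s2) { return "sub"; }
    }

    static final int INFINITY = Integer.MAX_VALUE;

    // Number of superclass steps from argType up to paramType, or INFINITY
    // if paramType never occurs on the chain from argType to Object.
    static int distance(Class<?> argType, Class<?> paramType) {
        int d = 0;
        for (Class<?> c = argType; c != null; c = c.getSuperclass(), d++) {
            if (c == paramType) return d;
        }
        return INFINITY;
    }

    static Object call(Object target, String name, Object... args) {
        List<Method> methods = new ArrayList<>();
        List<int[]> distances = new ArrayList<>();
        // Collect applicable candidates together with their distance rows.
        for (Method m : target.getClass().getMethods()) {
            if (!m.getName().equals(name) || m.getParameterCount() != args.length) continue;
            Class<?>[] params = m.getParameterTypes();
            int[] d = new int[args.length];
            boolean applicable = true;
            for (int a = 0; a < args.length; a++) {
                d[a] = distance(args[a].getClass(), params[a]);
                applicable &= d[a] != INFINITY;
            }
            if (applicable) { methods.add(m); distances.add(d); }
        }
        if (methods.isEmpty()) throw new IllegalArgumentException("no applicable method " + name);

        // Choose the candidate with the smallest summed distance.
        int chosen = 0;
        long minimum = Long.MAX_VALUE;
        for (int m = 0; m < methods.size(); m++) {
            long sum = 0;
            for (int x : distances.get(m)) sum += x;
            if (sum < minimum) { minimum = sum; chosen = m; }
        }
        // The choice is ambiguous if another candidate is closer in some position.
        for (int m = 0; m < methods.size(); m++) {
            if (m == chosen) continue;
            for (int a = 0; a < args.length; a++) {
                if (distances.get(m)[a] < distances.get(chosen)[a]) {
                    throw new IllegalStateException("ambiguous call to " + name);
                }
            }
        }
        try {
            return methods.get(chosen).invoke(target, args);
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException(e);
        }
    }
}
```

With these definitions, calling m on an A instance with a Sub argument selects
m(Sub), while passing two Sub arguments to Ambiguous.m raises the ambiguity
exception, matching the example above.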

8 An Example Application/GUI Using Reflection API 

Fig. 2. Screenshot of Prolog IDE written in Prolog

This GUI has almost completely been implemented in Prolog using the reflection
API. A special builtin, which allows us to redirect output to a string, is used
to interface default Prolog I/O to text fields, text areas, etc. The total Java
code is less than 10 lines. Jinni provides, on the Java side, a simple mechanism
to call Prolog: Init.jinni("Prolog command"). Since we do not have references to
different objects in the Java code, but everything is in the Prolog code, we need
a mechanism to communicate between Java’s action-listener and Prolog. Jinni’s
Linda blackboards have been used for this purpose [5, 7]. Whenever an action
takes place, the Java side calls Jinni and performs an out on the blackboard with
a number identifying the type of action, by calling something like
Init.jinni("out(a(1))"). On the Prolog side we have a thread waiting on the
blackboard for input by doing an in(a(X)). After the out, variable X is unified
with 1; depending on this value, Prolog takes the appropriate action and again
waits for a new input. Hence, we can make action events such as button clicks
communicate with Prolog. The code for button “send” in Appendix B shows exactly
how this is done.
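
The out/in hand-off described here can be imitated in plain Java with a blocking
queue. The sketch below is only an analogy and does not use Jinni's API: the
class ActionBoard and its methods are hypothetical stand-ins for the Linda
blackboard, with integer action codes playing the role of the terms a(1), a(2),
and so on.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class ActionBoard {
    // Pending action codes, in the order they were posted.
    private final BlockingQueue<Integer> actions = new LinkedBlockingQueue<>();

    // Producer side (the action listener): post an action code without blocking.
    void out(int actionCode) {
        actions.add(actionCode);
    }

    // Consumer side (the waiting thread): block until an action code is available.
    int in() {
        try {
            return actions.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
    }
}
```

A listener would call out(1) on a button click, and a worker thread looping on
in() would dispatch on the returned code and then wait again, much as the Prolog
thread does after each in(a(X)).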



9 Related Work 

Here we discuss a few other approaches followed for interfacing Prolog to Java 
using reflection or the Java Native Interface (JNI), and also a Scheme interface 
to Java using reflection. First, Kprolog’s JIPL package provides an interesting 
API from Java to a C-based Prolog and has a more extensive API for getting and 
setting fields. It also maps C-arrays to lists. Kprolog’s JIPL has dynamic
type inference for objects, but the problem of method signature determination
and overloading has not been considered in the package [3].

SICStus Prolog actually provides two interfaces for calling Java from Prolog.
One is the JASPER interface, which uses JNI to call Java from a C-based Prolog.
Obtaining a method handle from the Java Native Interface requires specifying
the signature of the method explicitly. So JASPER requires the user to specify,
as a string constant, the signature of the method that the user wishes to call.
This transfers the burden of finding the correct method to the user [4], who
therefore needs to know how to specify (sometimes intricate) method signatures
as Strings.

SICStus Prolog also has another interesting interface for calling Java from 
Prolog as a Foreign Resource. When using this interface the user is required to 
first declare the method which he wants to call and only then can the user in- 
voke it. Declaring a method requires the user to explicitly state the ClassName, 
MethodName, Flags, and its Return Type and Argument Types and map it to 
a Prolog predicate. Now the Prolog predicate can be used directly. This feature 
makes the Java method call look exactly like a Prolog builtin predicate at run- 
time - which keeps the underlying Java interface transparent to, for instance, a 
user of a library. (This is very similar to our old Builtin Registration and
Execution mechanism, with one difference: here registration or declaration is on 
the Prolog side, while we were doing the same on Java side - for catching all 
errors at compile time.) The interface still requires the programmer to explicitly 
specify types and other details as the exact method signature [4]. 

Kawa Scheme also uses Java reflection to call Java from Scheme. To invoke 
a method in Kawa Scheme one needs to specify the class, method, return type 
and argument types. This gives a handle to call the method. Now the user can 
supply arguments and can call this method. Again, the burden of selecting the 
method is left to the user as he specifies the method signature [1]. 

In our case, like JIPL and unlike other interfaces, we infer Java types from
Prolog’s dynamic types. But unlike JIPL, and like approaches that explicitly
specify signatures, we are able to call methods where the argument type is not
exactly the same as the parameter type. Hence, our approach mimics Java exactly.
The functionality is complete and the burden of specifying argument types is 
taken away from the user. 



10 Future Work 

Future work includes extending our API, as currently we do not support getting 
and setting fields and arrays. Another interesting direction which is a conse- 
quence of the development of a reflection based API, is the ability to quickly 
integrate Java applications. We have shown the power of the API with the sim- 
ple GUI application. Such applications can be built either completely in Java 
with an API based on methods to be called from Prolog, or almost completely 
in Prolog using only the standard packages of Java. 

Jinni 2000 has support for plugins such as different Network Layers (TCP-IP
and multicast sockets, RMI, CORBA) and a number of applications such as
Teleteaching and a Java3D animation toolkit developed with its conventional
builtin interface. New applications and plugins can now be added by writing
everything in Prolog while using various Java libraries. Arguably, the main
advantage of such an interface is that it requires a minimal learning effort
from the programmer.

11 Conclusion

We have described a new reflection based Prolog to Java interface which takes 
advantage of implicit dynamic type information on both the Prolog and the 
Java sides. Our interface has allowed us to automate data conversion between
overloaded method parameters, through a new algorithm which finds the most
specific method corresponding to a Prolog call. The resulting run-time reflective
method dispatching mechanism provides accurate handling of overloaded meth- 
ods beyond the reflection package’s limitations, and is powerful enough to sup- 
port building a complete GUI library mostly in Prolog, with only a few lines of 
application specific Java code. 

The ideas behind our interfacing technique are not specific to Jinni 2000 - 
they can be reused in improving C-based Prolog-to-Java interfaces like JIPL or 
Jasper or even Kawa’s Scheme interface. Actually our work is reusable for any 
languages with dynamic types, interfacing to Java, as our work can be seen as 
just making Java’s own Reflection package more powerful.

Appendix A: Example of Prolog Code for the Reflection
Based Jinni GUI

/* builds a simple IDE with GUI components */
jinni_ide :-
  new_java_class('JinniTyagiFrame',JTF),
  new_java_obj(JTF,new('Satyam Tyagi'),NF),
  invoke_java_method(NF,show,_Ans),
  new_java_class('java.awt.Label',L),
  new_java_obj(L,new( ,NL),
  invoke_java_method(NL,setBounds(30,50,20,30),_A4),
  invoke_java_method(NF,add(NL),_C4),
  new_java_class('java.awt.TextField',TF),
  new_java_obj(TF,new('type Prolog query here'),NTF),
  invoke_java_method(NTF,setBounds(50,50,300,30),_A1),
  invoke_java_method(NF,add(NTF),_C1),
  new_java_class('java.awt.Button',B),
  new_java_obj(B,new('Send'),NB),
  invoke_java_method(NB,setBounds(400,50,50,30),_A3),
  invoke_java_method(NF,add(NB),_C3),
  invoke_java_method(NB,addActionListener(NF),_B4),
  new_java_class('java.awt.TextArea',TA),
  new_java_obj(TA,new('results displayed here'),NTA),
  invoke_java_method(NTA,setBounds(50,100,500,250),_A2),
  invoke_java_method(NF,add(NTA),_C2),
  bg(the_loop(NTA,NTF)). % code not shown for the_loop/2



Appendix B: Java Code for the Send Button in the
Reflection Based Jinni GUI 

import java.awt.*;
import java.awt.event.*;
import tarau.jinni.*;

public class JinniTyagiFrame extends Frame implements ActionListener{
  public JinniTyagiFrame(String name){
    super(name);
    setLayout(null);
    resize(600,400);
  }

  public void actionPerformed(ActionEvent ae){
    String click=ae.getActionCommand();
    if(click.equals("Send")){
      Init.askJinni("out(a(1))");
    }
  }

}

Abstract. The PARMC system performs model checking for systems
described in the XL language, a variant of CCS. Extending previous
work by Dong and Ramakrishnan that compiled XL specifications into
an optimized transition relation, we take their transition relation and
compile it into S-code, producing instructions for a new lightweight
engine S that was custom designed for PARMC. Difficult constructs are
handled by the XSB logic programming engine, enabling both generality
(from XSB) and speed (from S). States are maintained in a compressed
representation, also improving memory utilization. Experiments
were performed and showed that the anticipated speed and memory
improvements were achieved.



1 Introduction 

Model checking, where one determines whether a desired property holds for a
formally specified system, has become an important part of industrial practice.
For instance, it is the most computationally intensive activity during processor
design at Intel [13]. Typically, the system is a finite-state communication
protocol or hardware device, and the property is expressed in a temporal logic.
For example, one
might want to guarantee that a communication protocol cannot deadlock. Since 
the systems usually have many states, their state spaces are usually specified im- 
plicitly, by giving an initial state and high-level specification of the transitions 
the system can undergo. 

Due to the growing importance of model checking, it is important to consider
the application of declarative programming to this area. For instance, the
lazy functional language fl serves many purposes in the Intel verification tools:
it is the specification language, the scripting language, and an implementation
language [13]. Similarly, tabled logic programming (LP) serves many purposes
in the XMC model checker [14]: XL, the specification language for XMC, is an
attractive mixture of CCS [9] and Prolog. XMC’s implementation exploits the
semantic relationship between model checking and tabled LP; in a sense, the
tabled LP engine of XSB [15] is used as an efficient engine for performing the
required fixed-point calculations. An initial implementation of XMC consisted
of a few rules giving the operational semantics of XL, some Prolog terms
representing the XL specification, a few rules specifying the semantics of the
modal mu-calculus, and a term representing a mu-calculus formula. Then XSB was
queried to determine whether the initial state of the system modeled the
mu-calculus formula [10]. Despite its elegance, this implementation lacked the
speed of model checkers that had a less declarative basis. However, Dong and
Ramakrishnan have worked to overcome this gap by applying partial evaluation and
other LP optimizations to the specification and formula [3, 4].

The goal of the PARMC project is further speed for XL model checking; 
we hope to produce an efficient, parallel model checker. In contrast, the XMC 
approach to model checking is entirely declarative, which makes it easy to experi- 
ment with various temporal logics and different specification languages. However, 
there is a niche for efficient hand-crafted programs that check XL specifications 
against properties given in a few popular temporal logics. 

PARMC’s main idea is to exploit the special characteristics of the XSB 
trans/3 relation produced by XMC’s XL compiler. This relation describes the 
transitions of the system and is usually straightforward. In Section 5, we show 
that state-generation costs are dominant in PARMC. However, we observe that 
state generation usually does not require the general machinery of a tabled LP 
engine. Instead, a stripped-down engine S represents each state in significantly 
less memory than XSB; moreover, S calculates new states several times faster 
than XSB. Rarely, however, there are certain special cases that a stripped-down 
engine cannot handle, and we rely on limited help from the XSB engine as a 
backup mechanism for these. This paper reports on this experience, especially 
the interplay between the powerful XSB engine and the lightweight machine S 
that quickly executes the common cases. On some simple examples, PARMC is 
usually 4-8 times faster than XMC and significantly more space efficient. 

The remainder of the paper is organized as follows: first, Sect. 2 considers
the optimized transition relation obtained in [3]. In Sect. 3 we identify special
properties of trans/3 that motivate the lightweight engine S, which is also
described in that section. Compilation of trans into S-code (and difficult XSB
fragments) is the main topic in Sect. 4, while experimental results are given in
Sect. 5. A discussion in Sect. 6 concludes the paper.

2 Compiled XL 

In XMC, systems are described in XL [11], which is a variant of the value-passing
CCS [9] that allows use of auxiliary Prolog predicates. Although a detailed un- 
derstanding of XL is not required for this paper, an example may be helpful and 
is given in Fig. 1. The system sys is the parallel composition of a sender s and 
a receiver r that communicate via shared channel Chan. The sender repeatedly 
chooses between two messages; its choice is then sent to r. One of the choices is 
an integer list whose sum is three, and upon receiving it, r does the observable 
yes action. (The syntax of XL continues to evolve; the documentation in [14] 
describes the semantics and current syntax of XL.) 

In the example, one sees the ::= operator that is used to define a process,
the input and output operators ? and !, and the process combinators | (parallel
composition), ; (sequential composition) and # (nondeterministic choice).
As well, auxiliary Prolog definitions may be embedded in XL; these auxiliary
predicates may then be used in process definitions. An example is the sum/2
predicate in Fig. 1. Note also the free mixing of Prolog data structures (lists)
and the mixture of Prolog operators such as == and =:=.

{* sum([],0).
   sum([H|T],R) :- sum(T,R1), R is R1 + H. *}
s(Chan) ::= {Chan ! [1,1] # Chan ! [1,2]}; s(Chan).
r(Chan) ::= Chan ? X; sum(X,L);
            if (L =:= 3) then {action(yes); r(Chan)}
            else if (L == 4) then end else r(Chan).
sys ::= s(Ch) | r(Ch).

ae_yes += <->tt /\ [-yes] ae_yes.

Fig. 1. This XL specification has a sender s and receiver r composed in parallel.
The property ae_yes requires that the system eventually perform the yes action;
note that ae_yes does not hold for this system: s may repeatedly send only [1,1].

Dong and Ramakrishnan[3] described how to compile XL specifications into 
Prolog, specifically, into a global transition relation trans/3 for the entire sys- 
tem. They observed that, for many interesting cases, the system in question can 
be transformed into the parallel composition of a fixed number of sequential
components. In CCS (hence XL), the entire system has a synchronizing τ transition
whenever there are two sequential components Ci, Cj and a channel y where
Ci has an out transition on y and Cj has an in transition on y. Dong and
Ramakrishnan determine all possible τ transitions at compile time and thereby
precompute the system-wide trans/3 relation. Then the XSB query trans(a,B,C)
determines the pairs (B,C) of transition-labels and successor-states immediately 
reachable from a state term a. (Note that states are represented as XSB terms.) 
For the example in Fig. 1, there are nine relevant trans rules, and two illustrative 
ones are shown in Fig. 2. 



trans(sys(A,r3(B,C)), action(yes), sys(A,r0(B,C))).
trans(sys(s0(E,F),r0(E,G)), tau, sys(s0(E,F),r3(E,G))) :- sum([1,1],I), I =:= 3.

Fig. 2. Two of the compiled trans rules for the example XL specification in
Fig. 1. Functors r3 and r0 correspond to distinct control points within sequential
component r. Variables C and G match continuations for component r, variable
F matches the continuation for s, whereas variable A matches entire sequential
component s.

The compiled XL specification consists of trans/3, a startstate predicate
that identifies the initial state, and whatever auxiliary Prolog predicates were
used in the XL program. We use the trans/3 relation of the compiled XL
specification as the starting point for our work on PARMC^1.



3 PARMC 

PARMC exploits several observations about compiled XL that suggest the full 
power of a (tabling) Prolog system is not always required. First, state terms 
are assumed ground^2. Thus, for trans/3 we can pattern-match on the current
state instead of using full unification. Second, the number of distinct functors 
is small, and the integers appearing in the state are usually small. It should 
be possible to represent states concisely, but the term representation used by 
XSB is more general and wastes space in our application. Finally, trans bodies 
are mostly composed of some simple determinate goals that determine rule ap- 
plicability, followed by some simple determinate goals that compute sequential 
component(s) that have changed. Nevertheless, some trans rules may contain 
difficult goals, perhaps involving negation or (as in Fig. 2) auxiliary rules.

Some of the properties observed can be indicated to certain Prolog compilers 
via annotations, or they may be inferred by static analyses. However, the prop- 
erties exploited by a particular Prolog compiler are unlikely to include all useful 
properties that the application’s developer might discover. As well, a Prolog 
compiler’s complexity must increase with every additional property exploited. 
Thus, there is room for both optimizing LP systems and custom lightweight 
engines such as S, used by PARMC. 

State Generation — Overview: PARMC compresses state terms and compiles
the trans rules into sequences of instructions that manipulate the
compressed state terms for common goals. In difficult rules, S will end up invoking
the XSB engine for some goals. This is expensive, since we must compress and 
decompress terms for XSB. Fortunately, in our examples the difficult rules are 
few and PARMC makes it easy to add support for a new goal. For instance, the 
standard length/2 predicate originally required an escape to the XSB engine. 
It was a frequently used auxiliary, however, so a new instruction was added for 
S and a new compilation rule was added to the trans-rule compiler. Thus rules 
involving length/2 no longer necessarily required an expensive term decompres- 
sion and XSB invocation. Similarly, though negation in general is not handled 
by S, the trans rules contain many cases where the negated goal is something 
like “X = 3”, where X is already bound to an integer. A collection of compiler 
rules handles such special cases. 

^1 See [6] (online) for more realistic examples of trans relations, S code, etc.

^2 Unlike CCS, XL allows non-tail calls and hence each sequential component
maintains its continuation. One example in [14] has continuations that contain
variables, representing yet-to-be computed values that will be returned. Changes
to the XL compiler may be required.

Compressed State Representation: In PARMC, each state term is compressed
into a null-terminated byte sequence. The compressed format stores the
linearized form of the term, corresponding to the tokens (integer, float, functor,
or variable) of a pre-order traversal of the corresponding XSB term. Each token’s
representation uses between one and five bytes, and while one byte is usually
sufficient, up to four continuation bytes can be used to store a 32-bit quantity.
Refer to Fig. 3.



Token
  token byte:          0 | type | value<0..4>
  continuation bytes:  1 | value<5..11>  ...  1 | value<26..31>   (each optional)

  type values
  00  variable
  01  integer
  10  float (unused)
  11  functor (value is symbol table index)

Compressed Term
  token 0 | token 1 | token 2 | ... | 0 (null byte)



Fig. 3. A compressed term is represented as consecutive tokens and is termi- 
nated by a null byte. Each token is represented by a token byte that contains 
the least-significant five bits of data, and the token byte is followed by zero to 
four continuation bytes. Each continuation byte adds seven more bits of greater 
significance than those yet seen, and the most significant bit distinguishes token 
bytes from continuation bytes. 
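
The token layout of Fig. 3 can be made concrete with a small Java sketch. It is
an illustration under assumptions rather than PARMC's actual code: the class
TokenCodec and its methods are hypothetical, the two type bits are assumed to
sit directly below the most significant bit of the token byte, and the
interaction with the terminating null byte (a variable token with value 0 would
also encode as a zero byte) is not handled here.

```java
import java.io.ByteArrayOutputStream;

final class TokenCodec {
    // Two-bit type codes as listed in Fig. 3.
    static final int VARIABLE = 0, INTEGER = 1, FLOAT = 2, FUNCTOR = 3;

    // Token byte: 0 | type (2 bits) | value bits 0..4.
    // Continuation bytes: 1 | next 7 value bits, least significant group first.
    static byte[] encode(int type, int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write((type << 5) | (value & 0x1F));
        int rest = value >>> 5;            // remaining 27 bits, treated as unsigned
        while (rest != 0) {
            out.write(0x80 | (rest & 0x7F));
            rest >>>= 7;
        }
        return out.toByteArray();
    }

    // Type code stored in the token byte.
    static int type(byte[] token) {
        return (token[0] >> 5) & 0x3;
    }

    // Reassemble the 32-bit value from the token byte and its continuation bytes.
    static int value(byte[] token) {
        int v = token[0] & 0x1F;
        int shift = 5;
        for (int i = 1; i < token.length; i++) {
            v |= (token[i] & 0x7F) << shift;
            shift += 7;
        }
        return v;
    }
}
```

Small values such as the integer 6 fit in the single token byte, while a full
32-bit quantity needs the token byte plus four continuation bytes, which matches
the one-to-five-byte range stated in the text.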



3.1 Successor State Stack Machine S 

Machine S executes the compiled trans rules and determines the successors of 
the current state. It has several components (see Fig. 4), including two stacks, 
and it operates as follows on the current state in its compressed representation: 
The machine selects a rule and executes the compiled S-code for this rule. The
S instructions for a rule are executed consecutively until one of them fails or
the end of the rule is reached. Only straight-line code is required; on failure we
may sometimes restart the rule’s code from the beginning, but we never need to
backtrack within the instruction sequence. If we reach the end of the rule, the
contents of the output stack can be used to build a successor. Otherwise, the 
machine can be wiped clean (by resetting its two stack pointers) and a new rule 
tried. There is no need to reload the current state, initialize registers, or engage 
in memory management on failure. 

Note that some pointers into the compressed term go to (boxed) integer
subterms. However, we prefer to store unboxed integers in the stacks and require
them in registers. Thus, a boxed integer is unboxed when it is moved into a
register. The compiler guarantees that only a few integer-manipulation
instructions have to test for boxed integers. All other instructions can assume
unboxed integers, and this is important because most trans rules contain integer
manipulation.^3



3.2 Instruction Set 

The instructions S executes may be classified as pattern-matching, data transfer,
arithmetic/logical, relational, output, and prologism. In what follows, N is an
integer data value, Nc indexes the children of some subterm (with the first child
at index 1), Nr indexes the registers, and Nf indexes the PARMC symbol table (for
instance, list constructor ./2 is always at index 0 in this symbol table). For
details of the instructions, see [7, Appendix A].

Pattern-Matching Instructions: The pattern-matching instructions determine
whether a rule unifies with a compressed term, and they also move matched
variables to registers. Before the code for a rule begins, a pointer to the current
state term is pushed onto the operand stack; in the discussion below, let S be
the (sub)term whose pointer is atop the operand stack.

The pattern-matching instructions include

— get(Nc), which pushes a pointer to the Nc-th child of S onto the operand
stack.

— checkterm(Nr) and checkterm_neg(Nr), which test whether S matches the
subterm pointed to by register Nr. Both terms are ground and the
representation is canonical, so byte-wise comparison suffices.

— test(Nf) and test_not(Nf), which test the functor of S against Nf. When
one succeeds, the other would fail.

— eqls, which tests whether the top two items on the operand stack are the
same integer. This instruction must handle both boxed and unboxed integers.

— mark(Nr), which pops S into register Nr. If S is a boxed integer, the register
receives it unboxed; thus the invariant is maintained that registers do not
store boxed integers.

Data Transfer Instructions: There are instructions that move values between
the registers and the operand stack and an instruction to discard from the
operand stack. These instructions are

— pushvar(Nr), which pushes a value (term or integer) from the specified 
register. Note that it leaves integers unboxed. 

— pop, which discards the top element of the operand stack. (Some instructions 
remove operands from the stack but others do not, anticipating their reuse.) 

— setvar(Nr), which pops the tagged value atop the stack into the specified 
register. Unlike mark, it requires that integers be already unboxed. Typically, 
setvar is used after an arithmetic computation. 

^3 The XL compilation of if cond then block often replicates integer tests in cond;
they are guards on trans rules arising from block.



Fig. 4. Machine S executes the code compiled from trans/3. At the bottom, 
the dashed line indicates the boundary of S. Boundary items SymTab and XSB 
Query are used only to interface with the XSB engine. 

In the registers and on the stacks, possible tags include a term tag and an
unboxed-integer tag. The additional constructor tag appears only in the output
stack.
Most terms are pointers into the current state, but terms returned from the 
XSB engine are also possible. The current-state term has been indexed so that 
we know where each subterm begins and ends, but this is not indicated above. 
During this indexing, various rules have been tagged as inapplicable. 

The pointer indicating the current instruction to be executed is shown with a 
dashed line; in our implementation, the “S code” is a sequence of function calls 
executed by the native CPU. Therefore, we need not maintain our own program 
counter.

Arithmetic and Logical Instructions: The instruction set also provides
various arithmetic instructions that use the operand stack. They include

— pushcon(N), which pushes an unboxed integer constant N, 

— add, sub, mul, idiv, imod, max and negate, some basic arithmetic op- 
erators on unboxed integers, 

— bitwise or, and, rshift, lshift and not operations.

Relational Instructions: Besides eqls, we have 

— neqls, gtr, gte, less, leq, which perform standard relational tests on 
the two integers atop the stack. Clearly, these can fail. 
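
To make the straight-line, fail-fast execution style concrete, here is a toy
Java interpreter for a few of the integer instructions named above. It is a
sketch under assumptions and not the S machine itself: the class MiniS and the
textual instruction format are hypothetical, all operands are plain unboxed
integers, every arithmetic or relational instruction is assumed to pop both of
its operands, and relational tests are assumed to compare the second-from-top
item with the top item.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class MiniS {
    // Executes the instructions in order over an integer operand stack.
    // Returns false as soon as a relational test fails, true at the end of the code.
    static boolean run(String[] code, int[] regs) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (String instr : code) {
            String[] p = instr.trim().split("\\s+");
            String op = p[0];
            if (op.equals("pushcon")) {
                stack.push(Integer.parseInt(p[1]));
            } else if (op.equals("pushvar")) {
                stack.push(regs[Integer.parseInt(p[1])]);
            } else if (op.equals("setvar")) {
                regs[Integer.parseInt(p[1])] = stack.pop();
            } else {
                int top = stack.pop();
                int below = stack.pop();
                switch (op) {
                    case "add":   stack.push(below + top); break;
                    case "sub":   stack.push(below - top); break;
                    case "mul":   stack.push(below * top); break;
                    case "eqls":  if (below != top) return false; break;
                    case "neqls": if (below == top) return false; break;
                    case "gtr":   if (below <= top) return false; break;
                    case "gte":   if (below < top) return false; break;
                    case "less":  if (below >= top) return false; break;
                    case "leq":   if (below > top) return false; break;
                    default: throw new IllegalArgumentException("unknown instruction: " + instr);
                }
            }
        }
        return true;
    }
}
```

For example, the guard 3*X =:= N followed by Y is X + 1, with X, N and Y held in
registers 0, 1 and 2, could be written as pushcon 3; pushvar 0; mul; pushvar 1;
eqls; pushvar 0; pushcon 1; add; setvar 2. If the test fails, the run stops
before register 2 is touched.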

Term-Construction Instructions: After executing the code generated from
the body of the rule, we construct the new-state term. Currently, we use a
separate output stack on which the preorder term is built (in reverse order) and 
then copied to the new-state register. This process is unnecessarily complex and 
future implementations of PARMC will construct the new-state term directly, 
using modified versions of these instructions. 

The current method involves the following instructions: 

— outputvar(Nr), which pushes a register onto the output stack, 

— outputcon(N) , which pushes an unboxed integer constant, 

— setconstr(Nf), which marks a term with functor Nf by pushing a con- 
structor marker. 

— set_action(N), which labels the transition to the new state^4.



Example: If the operand stack of S initially contained a pointer to the current 
state term in Fig. 4, then the state shown in Fig. 4 would have been reached 
after executing the following: 

test(3)     ; ensures top-level functor is f3
mark(0)     ; pop pointer to whole state into register 0
pushvar(0)  ; put it back
get(2)      ; push pointer to f3(5,6). Stack depth now two.
get(2)      ; push pointer to boxed integer 6. Stack depth 3.
mark(1)     ; pop (and unbox) term 6 into register 1.
get(1)      ; push pointer to first argument of f3(5,6), i.e., boxed 5
pushcon(5)  ; push unboxed 5. Stack depth now



Prologism Instruction: The prologism(i, o, f) instruction queries the
XSB engine. Parameter f indicates the rule number; i and o are integers
representing the respective numbers of inputs (from the operand stack) and outputs
(to the operand stack) for the Prolog query. Machine S will initially construct
an appropriate query for the frag_f predicate, converting compressed subterms
into XSB’s native term representation. Continuing with the example in Fig. 2,
we have one output and no inputs in frag_1(A) :- sum([1,1],A). If the query
succeeds, the “output variables” will have been bound. A bound term is pushed
in unboxed form, should XSB return an integer. Otherwise, it is compressed
and a pointer is pushed. This causes some difficulties, since such pointers do
not point into the compressed current-state term. Some instructions cannot use
them, and the compiler must not generate such instructions. This is discussed
further in Sect. 4.

^4 The current implementation properly handles only atomic actions, which it maps
to integers. In the future, S will construct arbitrary ground terms as actions.

The prologism instruction is the only possible source of indeterminacy. If a 
compiled rule contains a prologism instruction that succeeds, S will later retry 
the rule, rather than advancing to the next rule. When S retries the rule, all 
work before the prologism is repeated, but the prologism does not construct 
a new query. Instead, it asks XSB for another answer to the previous query. 
Note that a given rule is allowed at most one prologism; all goals between the 
first and last uncompilable goals are put into the fragment for the XSB engine. 
Thus, S need not handle producer/consumer looping that might otherwise arise 
between two prologisms in a rule. 
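
The retry behaviour can be pictured with a minimal sketch, in which a Python
generator stands in for the stream of answers that XSB returns. The names
Prologism, run_rule and member_query are hypothetical and are not part of S or
of the XSB interface; the sketch only mirrors the control flow described above.

```python
from typing import Callable, Iterator, List, Optional


class Prologism:
    """Stand-in for one prologism instruction; `query` plays the role of XSB."""

    def __init__(self, query: Callable[[], Iterator[int]]):
        self._query = query
        self._answers: Optional[Iterator[int]] = None
        self.queries_built = 0

    def next_answer(self) -> Optional[int]:
        # The first execution constructs the query; a retry asks the
        # already-open query for its next answer instead.
        if self._answers is None:
            self._answers = self._query()
            self.queries_built += 1
        try:
            return next(self._answers)
        except StopIteration:
            self._answers = None  # no more answers: the rule finally fails
            return None


def run_rule(prologism: Prologism) -> List[int]:
    """Retry a rule until its single prologism has no further answers."""
    results = []
    while True:
        # (All work that precedes the prologism would be repeated here.)
        answer = prologism.next_answer()
        if answer is None:
            break  # advance to the next rule
        results.append(answer)  # success: the rule will be retried
    return results


def member_query() -> Iterator[int]:
    # Hypothetical nondeterministic fragment, e.g. member(A, [1, 2, 3]).
    yield from [1, 2, 3]


p = Prologism(member_query)
print(run_rule(p), p.queries_built)
```

In this sketch the query is built once even though the rule body runs four
times (three successes and one final failure).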

The prologism instruction is expensive, so we should avoid generating it. 
Hence, the compiler checks for many easy special cases of things that are gener- 
ally impossible for S. For instance, not Q is easily compiled, if Q is 3*X =:= N. 
Also, it is easy to extend the compiler; sometimes one needs to add new special- 
ized instructions to S, sometimes a sequence of existing instructions suffices. 

Compound instructions: Unlike Fig. 1, an interesting XL system is likely the
parallel composition of many components; for example,
sys ::= l_1 | l_2 | ... | l_20. Yet the interleaving semantics of XL and
CCS [9] implies that any system-wide transition changes only one l_i. The only
exceptions are τ transitions, in which two components change. Either way, most
of the system-wide state is unchanged and a typical compiled rule would be
trans(sys(l1_0(...), l2_0(...), A3, A4, ..., A20), tau,
sys(l1_1(...), l2_1(...), A3, ..., A20)). Using the instructions discussed so
far, the S-code will be dominated by sequences of gets and marks during pattern
matching and sequences of outputvars during new-state construction. A peephole
optimization replaces such sequences by getmark_batch and outputvar_batch
instructions. As well, it collapses adjacent gets and tests into a
get_and_test_child instruction. The getmark_batch(Nc,Nr,N) instruction is
equivalent to the 2N instructions get(Nc); mark(Nr); get(Nc + 1);
mark(Nr + 1); ... get(Nc + N - 1); mark(Nr + N - 1); the outputvar_batch
instruction is similar. On the Leader? specification [14], the unoptimized S
code had about 6150 instructions, which the optimization reduced by 60%.
Execution time (to generate all reachable states) improved by 14%. Since the
same work is done with fewer instructions, this demonstrates that the useful
work is more expensive than overheads such as fetching instructions.
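
The stated equivalence for getmark_batch can be written down directly. The
following is only a sketch: instructions are modelled as Python tuples, and the
functions expand_getmark_batch and peephole are hypothetical names, not the
compiler's own. The peephole pass shown handles only the get/mark case.

```python
def expand_getmark_batch(nc, nr, n):
    """Return the 2N get/mark instructions that getmark_batch(nc, nr, n) replaces."""
    code = []
    for k in range(n):
        code.append(("get", nc + k))
        code.append(("mark", nr + k))
    return code


def peephole(code):
    """Collapse runs of get(c+k); mark(r+k) pairs into one getmark_batch."""
    out, i = [], 0
    while i < len(code):
        n = 0
        if i + 1 < len(code) and code[i][0] == "get" and code[i + 1][0] == "mark":
            nc, nr = code[i][1], code[i + 1][1]
            # Extend the run while child and register numbers stay consecutive.
            while (i + 2 * n + 1 < len(code)
                   and code[i + 2 * n] == ("get", nc + n)
                   and code[i + 2 * n + 1] == ("mark", nr + n)):
                n += 1
        if n >= 1:
            out.append(("getmark_batch", nc, nr, n))
            i += 2 * n
        else:
            out.append(code[i])
            i += 1
    return out


unoptimized = (expand_getmark_batch(1, 0, 2) + [("pop",)]
               + expand_getmark_batch(1, 2, 1))
print(peephole(unoptimized))
```

Here seven instructions shrink to three, and expanding each batch again gives
back the original sequence.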

Library and Extension Instructions: Some instructions were added later, to
reduce the frequency of the expensive prologism instruction. For Leader?,
84% of the rules used the library length predicate. 50% used strip_from_end,
an auxiliary that removes the last element in a list and returns both it and the
new list. Although a cleaner extension facility might be envisioned, two new
instructions were added directly to the core. Computing length on our ground
lists (and failing on malformed lists) required about ten lines of C. An
instruction for strip_from_end required about 25 lines of C. Modifying the
compiler to emit these instructions was almost trivial, and a 16-fold speed
improvement was obtained: compressing and decompressing lists is slow work.
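
The behaviour of the two added instructions can be sketched as follows. This is
an illustration in Python, not the C code of S: ground lists are modelled as
nested ('.', Head, Tail) tuples ending in '[]', failure is modelled by
returning None, and the order of the two results of strip_from_end is an
assumption.

```python
NIL = "[]"


def cons(head, tail):
    return (".", head, tail)


def from_python(items):
    """Build a ground list term from a Python list."""
    term = NIL
    for item in reversed(items):
        term = cons(item, term)
    return term


def _is_cons(term):
    return isinstance(term, tuple) and len(term) == 3 and term[0] == "."


def length(term):
    """Length of a ground list; None (failure) if the list is malformed."""
    n = 0
    while term != NIL:
        if not _is_cons(term):
            return None
        n += 1
        term = term[2]
    return n


def strip_from_end(term):
    """Return (last element, list without it); None on empty or malformed lists."""
    items = []
    while term != NIL:
        if not _is_cons(term):
            return None
        items.append(term[1])
        term = term[2]
    if not items:
        return None
    return items[-1], from_python(items[:-1])


print(length(from_python([1, 2, 3])))
print(strip_from_end(from_python([1, 2, 3])))
```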



Example: Code for the rules in Fig. 2 follows.

Code for the first rule:
get_and_test_child(2,2);
getmark_batch(1,0,2); pop();
getmark_batch(1,2,1);
outputvar_batch(1,0);
setconstr(4); outputvar(2);
setconstr(3); set_action(2);

Code for the second rule:
get_and_test_child(1,5);
getmark_batch(1,0,2); pop();
get_and_test_child(2,4); get(1);
checkterm(0); getmark_batch(2,2,1);
pop(); prologism(0,1,1); setvar(3);
pushvar(3); pushcon(3); eqls();
outputvar(2); outputvar(0); setconstr(2);
outputvar_batch(1,0); setconstr(5);
setconstr(3); set_action(3);



Indexing: Prolog systems often use indexing to determine a small subset of
rules that may match a goal. Machine S also uses this optimization, especially
since we already must pre-process any state that is loaded into the current-state
register®. Recall that our systems are the parallel composition of (usually
sequential) components. While pre-processing the current state, we inspect the
top-level functor of each component. During compilation, "disqualification sets"
have been built up; for every pair (m, f) of sequential component m and
top-level functor f, we have a set that specifies the rules that cannot apply. (Since
the top-level functor for a sequential component corresponds to a control point 
in the XL specification, we are immediately disqualifying rules that require the 
sequential component to be elsewhere in its code.) Thus, while pre-processing 
the current state, we take the union of the relevant disqualification sets, and S 
avoids invoking the code for these inapplicable rules. On Leader?, this disquali- 
fied over 70% of the rules but reduced execution time only by about 8%. Since 
the overhead of computing the inapplicable rules was about 2%, it seems that 
inapplicable rules failed quickly. 
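
The indexing step amounts to a union of precomputed sets. A minimal sketch,
assuming that a state is a tuple of components, each a tuple whose first entry
is its top-level functor, and that rules are numbered; the function name and
the data below are hypothetical.

```python
def inapplicable_rules(state, disqualified):
    """Union the disqualification sets selected by each component's functor.

    `disqualified` maps (component index m, top-level functor f) to the set of
    rule numbers that cannot apply when component m has functor f.
    """
    skipped = set()
    for m, component in enumerate(state):
        functor = component[0]
        skipped |= disqualified.get((m, functor), set())
    return skipped


# Hypothetical three-component state and disqualification sets.
state = (("l1_0", 5), ("l2_3",), ("l3_1",))
disqualified = {
    (0, "l1_0"): {2, 3},
    (1, "l2_3"): {3, 7},
    (1, "l2_0"): {0, 1},
}
skip = inapplicable_rules(state, disqualified)
candidates = [r for r in range(10) if r not in skip]
print(sorted(skip), candidates)
```

Only the rules left in candidates would have their code invoked for this state.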



4 Compiler and Model Checker 

The compiler (about 1000 lines of Prolog) turns the trans/3 relation into S
code. For any XSB relation trans/3 and any XSB term t, S should execute that
code and compute the same answers that would be obtained by the XSB query
trans(t, Ans), provided that t is ground and all answers are ground, and
provided floats are not used (this restriction is inessential). Of course, it is only
important to emulate XSB on trans relations that arise from XL compilation.
However, arbitrary XSB fragments are permitted in XL programs; compilation
often propagates them into trans. Our compiler generates S instructions to
emulate XSB when that is easy; when that is difficult, it will generate a
prologism instruction that makes XSB do the work itself: a perfect emulation!
For more details on compilation, see [7, Appendix B].

® This is so that the beginning and length of each subterm are known, and so that the
get instruction requires O(1) time.

When compiling a rule, the first task is to handle pattern matching. Consider
a particular rule, trans(x,y,z) :- body. The compiler first emits instructions
that match against pattern x. If x is nonlinear, the first occurrence of some
variable Y results in a markvar instruction; later occurrences of Y result in
checkterm. To promote early failure, after matching a functor, we generate
pattern-matching code for its non-variable subterms before we bind its variable
subterms.
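
The ordering policy can be illustrated with a small sketch. It does not emit
real S code: the "test" operation and the use of child-index paths are
simplifications introduced here, and compile_pattern is a hypothetical name.
Patterns are Python tuples (functor, arg1, ...), and strings starting with an
upper-case letter are variables. Note that "first occurrence" means first in
emission order, which may differ from textual order.

```python
def is_var(term):
    return isinstance(term, str) and term[:1].isupper()


def compile_pattern(term, path=(), regs=None, out=None):
    """Emit matching operations for `term`, non-variable subterms first."""
    regs = {} if regs is None else regs
    out = [] if out is None else out
    if is_var(term):
        if term in regs:
            # Later occurrence: compare against the register bound earlier.
            out.append(("checkterm", path, regs[term]))
        else:
            # First occurrence: bind the next free register.
            regs[term] = len(regs)
            out.append(("markvar", path, regs[term]))
        return out
    if isinstance(term, tuple):
        functor, *args = term
        out.append(("test", path, functor, len(args)))
        children = list(enumerate(args, start=1))
        for i, arg in children:          # tests that can fail come first
            if not is_var(arg):
                compile_pattern(arg, path + (i,), regs, out)
        for i, arg in children:          # variable bindings come last
            if is_var(arg):
                compile_pattern(arg, path + (i,), regs, out)
        return out
    out.append(("test", path, term, 0))  # atomic constant
    return out


for op in compile_pattern(("sys", "X", ("f", "X", 3), "Y")):
    print(op)
```

In the example, the functor f and the constant 3 are tested before any
variable is bound, so a mismatch there fails without any binding work.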

Next, body is compiled. During this, the compiler remembers that terms that 
will be constructed by the prologism instruction are second-class and the reg- 
isters that will point to them are tagged as tainted. Whether a particular goal 
is compilable (into prologism-less S code) depends not only on the goal itself, 
but also on the taintedness of its operands. Should we find a first uncompilable 
goal, a prologism is unavoidable. And, since we permit only one prologism 
per rule, we must then find the earliest goal Q, such that all goals after (and 
including) Q are compilable. To determine whether Q is compilable is easy: we 
guarantee that no compilation rule (itself a Prolog rule) will succeed unless it 
would produce code that can be guaranteed correct. (For instance, a rule in- 
volving a bound variable used within Q will fail, if the variable is tainted and 
must be used in an unsafe way. Failure of this rule may propagate backward, 
eventually marking Q uncompilable.) Thus, after the first uncompilable state- 
ment generates a prologism, we see whether compilation can succeed for all the 
remaining goals — simply try it. If so, the code for the Prolog fragment is over. If 
not, the Prolog fragment is extended to include Q. The process continues until 
we find that Q and all goals after it are compilable. Then this Prolog fragment is 
emitted and the S code for the rest of the rule is generated. (The computational 
complexity may be exponential in the number of goals in a rule; fortunately, 
rules typically have few goals.) 
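
The search for the end of the Prolog fragment can be sketched under a strongly
simplified model of compilability. In this sketch a goal is compilable if it
is natively supported and none of the variables it must use safely is tainted,
where every variable touched by the fragment counts as tainted. The Goal class
and split_rule are hypothetical; the real compiler decides compilability by
attempting its compilation rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Goal:
    name: str
    native: bool                          # compilable when operands are untainted
    vars: frozenset = frozenset()         # variables the goal mentions
    needs_clean: frozenset = frozenset()  # variables that must be untainted


def split_rule(goals):
    """Return (compiled prefix, Prolog fragment, compiled suffix)."""
    first = next((i for i, g in enumerate(goals) if not g.native), None)
    if first is None:
        return list(goals), [], []        # no prologism needed
    end = first + 1
    while True:
        tainted = set().union(*(g.vars for g in goals[first:end]))
        rest = goals[end:]
        # "Simply try it": can every remaining goal be compiled?
        if all(g.native and not (g.needs_clean & tainted) for g in rest):
            return list(goals[:first]), list(goals[first:end]), list(rest)
        end += 1                          # extend the fragment by one goal


body = [
    Goal("X is N + 1", True, frozenset({"X", "N"})),
    Goal("lookup(X, A)", False, frozenset({"X", "A"})),
    Goal("A > 3", True, frozenset({"A"}), frozenset({"A"})),
    Goal("Y is X * 2", True, frozenset({"X", "Y"})),
]
prefix, fragment, suffix = split_rule(body)
print([g.name for g in fragment])
```

The loop always terminates, because an empty remainder is trivially
compilable; in the worst case the fragment extends to the end of the rule.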

This approach is quite pleasant for maintenance. By adding new compStmt 
rules we can extend when the S-code compilation succeeds, and the generation of
Prolog fragments decreases automatically. The compiler's efficiency could easily
be improved: compiling the 84 trans rules of Leader? into S-code takes about
7.5s; compiling S code (as C function calls) into machine language takes about
1s. (In comparison, XL compilation takes less than a second.)

4.1 Checking Temporal Formulae 

Constructing the successors of a particular state is only part of the work of 
PARMC, although it dominates the cost. Complex behavioural properties need 
to be expressed in some (temporal) logic, and a model checker must be able to 
verify them. In XMC, properties are usually expressed in the alternation-free 
fragment of the modal mu-calculus, although a declarative checker for the
popular logic CTL [1, 2, 5] was sketched in [12]. It is trivial to extend the CTL checker
in [12] to the variant of CTL used by PARMC; however, PARMC chooses
conventional, non-declarative techniques for CTL model checking. These lie outside
the scope of this paper. Nevertheless, it is germane to consider the interaction
of S with these conventional techniques. 

First, a few CTL examples are provided as motivation. To begin, consider
the CTL formula EF f, for some CTL formula f. Informally, if s ⊨ EF f then
"the system can (but possibly not inevitably) evolve from configuration s and
eventually reach configuration s', where s' ⊨ f." (One writes s ⊨ f to indicate
that property f is true when the system is in state s.) A useful property is
AX false; informally, "all successors of this state have the property false". Since
there is no state for which s ⊨ false, s ⊨ AX false means "s is deadlocked";
s ⊨ EF AX false means "for configuration s, future deadlock is possible."

The conventional approaches for model checking treat the state-space as a
directed graph in which certain subgraphs are sought, using well-known
algorithms for strong connectivity, DFS, and so forth. For instance, we can
determine whether s ⊨ EF f by reachability: can a search (depth-first or otherwise)
starting from s reach a node s' where s' ⊨ f? Of course, this requires knowing
whether s' ⊨ f. A bottom-up approach was described in [2]; it is said to be
"global" since it first computes the entire state space. Then the temporal
formula is processed bottom-up, by determining (and recording) whether s ⊨ f',
for every subformula f' and state s. With bottom-up processing, when we must
determine s ⊨ EF f, we will know whether s' ⊨ f. Carefully implemented, CTL
model checking is very efficient [2]. PARMC can do global model checking and the
interaction between S and the graph routines is minimal: S initially builds the
complete state space and terminates. Then the graph-traversal routines complete
the task.
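
For the two operators used in the examples, the global, bottom-up computation
can be sketched as follows. This is a textbook-style illustration, not PARMC's
graph code: the state space is a dictionary from each state to its list of
successors, and sat_ax_false and sat_ef are hypothetical names.

```python
def sat_ax_false(succ):
    """States satisfying AX false: those with no successors (deadlocked)."""
    return {s for s, targets in succ.items() if not targets}


def sat_ef(succ, sat_f):
    """States satisfying EF f, given the set of states satisfying f.

    Uses backward reachability over predecessor edges.
    """
    pred = {s: set() for s in succ}
    for s, targets in succ.items():
        for t in targets:
            pred[t].add(s)
    result = set(sat_f)
    work = list(sat_f)
    while work:
        t = work.pop()
        for s in pred[t]:
            if s not in result:
                result.add(s)
                work.append(s)
    return result


# Hypothetical complete state space: state 3 is deadlocked, state 4 loops.
succ = {0: [1, 2], 1: [0], 2: [3], 3: [], 4: [4]}
deadlocked = sat_ax_false(succ)            # subformula first (bottom-up)
may_deadlock = sat_ef(succ, deadlocked)    # then EF AX false
print(sorted(deadlocked), sorted(may_deadlock))
```

The inner subformula is evaluated for every state before the outer one, which
is what makes the approach global.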

An alternative approach requires more interaction between S and the
graph-traversal routines. It is called "on-the-fly" or demand-driven model checking
and may not generate all reachable states. Nor need the truth value of every
subformula be calculated for every generated state. Instead, we use a top-down,
goal-driven search: if demanded to resolve whether s ⊨ EF f, we might select
some node s' reachable from s and demand whether s' ⊨ f. If indeed s' ⊨ f,
we are done. We need not have generated any more of the state space than s,
s', the path between them, and whatever portion of the state space was used
to prove s' ⊨ f. The disadvantages of the demand-driven approach are
irregularity and additional book-keeping time and space. For instance, we still
must table the results of computation; now we must also check the tables to
avoid repeating work. Also, we may need multiple sets of traversal markers per
node: several different DFSs may be simultaneously underway. PARMC can do
demand-driven model checking: the graph routines control the computation and
states are marked as "expanded" or "unexpanded". The graph routines will
periodically give S an unexpanded state s and ask for its expansion. After S
returns all states to which s has transitions, any new ones are added to the
state space and marked as unexpanded.
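
A minimal sketch of this interaction for a formula EF f with atomic f: the
callback expand stands in for S, holds decides whether f is true in a state,
and the search stops as soon as a witness is found. The function name and the
example system are hypothetical, and the sketch omits the tabling and multiple
marker sets mentioned above.

```python
def demand_driven_ef(start, expand, holds):
    """Decide start |= EF f, expanding states only on demand.

    Returns (truth value, dict of expanded states and their successors).
    """
    expanded = {}               # states already handed to the expander
    seen = {start}              # all states added to the state space so far
    stack = [start]             # unexpanded states awaiting a decision
    while stack:
        s = stack.pop()
        if holds(s):
            return True, expanded   # witness found: stop generating states
        expanded[s] = expand(s)     # ask the machine for all successors of s
        for t in expanded[s]:
            if t not in seen:       # new states start out unexpanded
                seen.add(t)
                stack.append(t)
    return False, expanded


# Hypothetical system: a chain 0 -> 1 -> ... -> 99; f holds only in state 2.
def expand(s):
    return [s + 1] if s < 99 else []


found, expanded = demand_driven_ef(0, expand, lambda s: s == 2)
print(found, sorted(expanded))
```

Only two of the hundred states are expanded here, whereas the global approach
would have generated all of them first.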

Since there may be multiple ways to determine whether s ⊨ f, the graph
routines use heuristics to generate as little of the state space as possible. For
instance, suppose state s1 is expanded but s2 is not. If knowing either s1 ⊨ f1
or s2 ⊨ f2 would prove s ⊨ f, we will demand whether s1 ⊨ f1. That choice
appears more likely to confine the computation to the existing state space, and
it also seems more likely to find and reuse memoized results. Note that S is
oblivious to all these heuristics.

5 Results 

This section discusses PARMC’s performance and its implementation, which is 
relatively small: The current system consists of about 4500 lines of C, over half 
of which is for S and its XSB interface. Fewer than a thousand lines are required
for the demand-driven and global model checkers, and the remainder is due to 
concurrent data structures and multiprocessor support code (primarily memory 
management). The sequential system is complete, but work continues on the 
multiprocessor implementation. 

State Generation (PARMC vs XSB vs XMC): PARMC's speed at calculating the
state space was compared to tabled XSB's speed® in computing a
transitive-closure-type query on trans that generates and tables each reachable 
state. Both CPU-time and memory-consumption were measured. The values 
reported for PARMC are often dominated by the memory required by the XSB 
slave, which never reported less than 1.9MB. Statically allocated memory and 
code space may not be included, but they are known to be only a few hundred 
kilobytes. Tests are based on four trans relations (kindly supplied by Y. Dong) 
compiled from XL specifications ABP, Leader?, Meta23 and Sieve6 in the XMC 
Release [14]. Several of them were used in [3, 10] to compare XMC with other
model checkers. 

Table 1 compares the speed with which XSB and PARMC construct the 
reachable states; it also compares the speed of XMC and PARMC in model 
checking^. For ABP and Meta23 the property is that deadlock cannot occur; 
for Leader?, that precisely one leader is always elected; for Sieve6, that action 
‘finish’ is always generated. These properties require inspection of all, or most, 
of the state space, and the PARMC results are due to the global model checker: 
it reports the state-generation time separately from the graph-traversal time. 

XSB and XMC space measurements are the totals reported by XSB, and
include the table space, which is also tabulated separately. Various optimizations
(see [10]) clearly help XMC avoid tabling the entire state space. However, the
state-vector technique used in [3] undoes the "optimized state representation"
from [10] and leads to larger table sizes than in [10, Table 2]. The difference
between XSB's table size and PARMC's entire memory use is largely due to
PARMC's space-efficient state encoding. This difference is especially pronounced
when one realizes that PARMC also records transitions and its figures include
about 2MB overhead for the XSB slave. None of this overhead is related to state
tabulation.

® Test environment: Pentium III, 450MHz, 128MB SDRAM, Linux 2.2.11
(uniprocessor), XSB 2.1 (CHAT except SLG-WAM for XMC), gcc 2.7.2.3 with -O3.

^ The trans/3 relations obtained from Y. Dong do not exactly match those generated
by the XL compiler in [14]. In all cases, they ran faster than those produced by
the released XL compiler and consumed no more memory. Presumably, they were
produced by a different version of the XL compiler. Tabulated results come from
using Dong's supplied trans relations with XMC.

Table 1. Comparing PARMC to XSB and XMC.

                              PARMC                 XSB                  XMC
Example  States  trans   time  Space  check   time  Space  table   time  Space  table
                 rules  (sec)   (MB)  (sec)  (sec)   (MB)   (MB)  (sec)   (MB)   (MB)
ABP         260     19   0.07    2.4   0.01   0.05    2.1   0.06   0.03    2.1   0.06
Leader?   11939     84   1.79    4.7   0.44   5.25   13.8   12.8   9.42   16.1    2.1
Meta23    16955     73   1.19    4.6   0.06   4.06   12.2    4.5   9.63    9.7    2.1
Sieve6    1415?     31   0.83    4.4   0.11   1.73    6.7    4.9   4.44   16.4    2.4

Comparing Demand-Driven to Global Checking: As noted, the demand-driven
checker has the potential to explore only a fraction of the state space,
depending on the system and property being checked. To compare the two 
checkers, we reconsider the model-checking problem for Leader? that was used 
in Table 1. For the global checker, it took 1.8s of CPU time to generate the 
states (12k) and their transitions (52k). Only 0.44s was spent determining that 
the property held. 

Though the demand-driven checker is likely to expand fewer states, it may 
have more overhead, and it has a more complex interaction with S. Never- 
theless, the demand-driven checker took 1.86s in total, expanded all but one 
state, and calculated the truth values for virtually all subformulae. Thus, the 
demand-driven checker did as much work as the global checker, but was faster. 
Considering the additional overheads anticipated for demand-driven checking, 
this result warrants further investigation. 

6 Discussion and Conclusion 

In this section, we examine the features of XSB that made PARMC possible. 
As well, we examine whether our experience suggests a useful way to develop 
high-performance systems using declarative technology. 

What’s Special about XSB (for PARMC)? Surprisingly, the answer is 
not tabulation, unless the auxiliary predicates use it. Rather, the availability 
of an efficient C interface to XSB was critical. The XSB engine ran as a slave, 
processing queries dynamically constructed from prologism instructions. An 
interface that required the query be constructed as a string would have been 
disastrously slow. Instead, we were able to construct the query term piecemeal 
by invoking API routines that returned handles to XSB’s internal terms, and we 
dismantled the answers (XSB terms) returned similarly. When the XSB foreign- 
code interface was designed, it may not have been clear what operations should 
be provided. For instance, an earlier version of the interface allowed one to test 
whether a subterm was a variable, but did not describe how to determine whether 
two variables in a term were identical. At one stage in PARMC, this functionality 
was required; fortunately, the API (but not its documentation) had been revised
before work on PARMC began. 

Useful Development Paradigm? One might wonder whether the general 
approach used in PARMC can be (or has been) used for many other applica- 
tions. In our approach, an initial elegant system is constructed using a powerful 
declarative language. Then a variety of common cases are identified where the 
full power of the declarative language is not required. A simpler machine is then 
designed to run quickly and execute the mundane parts of the application, while 
having a mechanism to escape back to the declarative language when necessary. 
Finally, an easily extended compiler generates the code for the simpler machine, 
generating instructions to invoke the declarative language for difficult-to-compile
constructs. 

Acknowledgements: I wish to thank the LMC and XSB groups at Stony 